Realm of Chaos

Caution: this post is for nerds only.

I watched this Numberphile video last night and was fascinated by the point pattern that was created in it. I thought I would quickly program my own version to recreate it and then look at patterns made by more points.

I didn’t realise until afterwards that there is actually a web version of the program used in the video here. It is a bit limited though so my code was still worthwhile.

A fractal triangular pattern can be created by:

  1. Setting three points
  2. Picking a randomly placed seed point
  3. Rolling a die and going halfway from the current point towards the point chosen by the roll
  4. Repeating the last step
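
The four steps above can be sketched in a few lines. This is a minimal Python sketch and not the IgorPro code used for the images in this post. The function names (regular_polygon, chaos_game) are made up for illustration, the starting point defaults to the centre, and the "die roll" is simply a uniform random choice among the fixed points.

```python
import math
import random


def regular_polygon(n_vertices, radius=1.0):
    """Return the vertices of a regular polygon centred on the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / n_vertices + math.pi / 2),
         radius * math.sin(2 * math.pi * k / n_vertices + math.pi / 2))
        for k in range(n_vertices)
    ]


def chaos_game(vertices, n_points, start=(0.0, 0.0), rng=None):
    """Repeatedly jump halfway towards a randomly chosen vertex."""
    rng = rng or random.Random()
    x, y = start
    points = []
    for _ in range(n_points):
        vx, vy = rng.choice(vertices)       # the "die roll"
        x, y = (x + vx) / 2, (y + vy) / 2   # go halfway towards that vertex
        points.append((x, y))
    return points


# Example: 30000 points from an equilateral triangle.
triangle = regular_polygon(3)
points = chaos_game(triangle, 30000, rng=random.Random(1))
```

The returned list of (x, y) pairs can be handed to any plotting tool as a scatter plot. Passing regular_polygon(5) or regular_polygon(6) instead gives the other shapes.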

If the first three points are randomly placed the pattern is skewed, so I added the ability to generate an equilateral triangle. Here is the result.

And here are the results of a triangle through to a decagon.

All of these are generated with one million points using alpha=0.25. The triangle, pentagon and hexagon make nice patterns but the square and polygons with more than six points make pretty uninteresting patterns.

Watching the creation of the point pattern from a triangular set is quite fun. This is 30000 points with a frame every 10 points.

Here is the code.

Some other notes: this version runs in IgorPro. In my version, the seed is set at the centre of the image rather than a random location. I used the random allocation of points rather than a six-sided die.

The post title is taken from the title track from Bolt Thrower’s “Realm of Chaos”.

Notes To The Future

Previously I wrote about our move to electronic lab notebooks (ELNs). This post contains the technical details to understand how it works for us. You can even replicate our setup if you want to take the plunge.

Why go electronic?

Lots and lots of lab books and folders.

Many reasons: I wanted to be able to quickly find information in our lab books. I wanted lab members to be able to share information more freely. I wanted to protect against loss of a notebook. I think switching to ELNs is inevitable. Not only that, I needed to do something about the paper notebooks: my group had amassed 100 in 10 years.

We took the plunge and went electronic. To recap, I decided to use WordPress as a platform for our ELN.

Getting started

We had a Linux box on which I could install WordPress. This involved installing phpMyAdmin and registering a MySQL database and then starting up WordPress. If that sounds complicated, it really isn’t. I simply found a page on the web with step-by-step instructions for my box. You could run this on an old computer or even on a Raspberry Pi; it just has to be on a local network.

Next, I set myself up as admin and then created a user account for each person in the lab. Users can have different privileges. I set all people in the lab to Author. This means they can make, edit and delete posts. Being an Author is better than the other options (Contributor or Editor) which wouldn’t work for users to make entries, e.g. Contributors cannot upload images. Obviously authors being able to delete posts is not acceptable for an ELN, so I removed this capability with a plugin (see below).

I decided that we would all write in the same ELN. This makes searching the contents much easier for me, the PI. The people in the lab were a bit concerned about this because they were each used to having their own lab book. It would be possible to set up a separate ELN for each person but this would be too unwieldy for the PI, so I grouped everyone together. However, it doesn’t feel like writing in a communal notebook because each Author of a post is identifiable and so it is possible to look at the ELN of just one user as a “virtual lab book”. To do this easily, you need a plugin (see below).

If we lost the WP installation it would be a disaster, so I set up a backup. This is done locally with a plugin (see below). Additionally, I set up an rsync routine from the box that goes off weekly to our main lab server. Our main lab server uses ZFS and is backed up to a further geographically distinct location. So this is pretty indestructible (if that statement is not tempting fate…). The box has a RAID6 array of disks but in the case of hardware failure plus corruption and complete loss of the array, we would lose one week of entries at most.
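
To give an idea of what a weekly rsync routine can look like, here is a minimal Python sketch that builds and runs an rsync command. It is an illustration only and not our actual setup: the paths, the host name "labserver" and the function names are hypothetical, and the schedule in the comment is just one way of running something once a week.

```python
import subprocess

# Hypothetical locations: adjust to your own installation and server.
SOURCE = "/var/www/wordpress/"          # trailing slash: copy the directory contents
DESTINATION = "labserver:/backups/eln/"


def build_rsync_command(source, destination):
    """Build the rsync call; -a (archive) preserves permissions and timestamps."""
    return ["rsync", "-a", source, destination]


def run_backup(source, destination):
    """Run rsync and raise an error if it exits with a non-zero status."""
    subprocess.run(build_rsync_command(source, destination), check=True)


# Show the command that would be run. To actually back up, call
# run_backup(SOURCE, DESTINATION), e.g. from a weekly cron job such as:
#   0 2 * * 0  python3 /path/to/this_script.py
print(" ".join(build_rsync_command(SOURCE, DESTINATION)))
```

Note that this only copies files. The WordPress database would also need to be dumped to a file first (the backup plugin can do this) so that the copy on the server is complete.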

Theme

We tried out a few before settling on one that we liked. We might change and tweak this more as we go on.

The one we liked was called gitsta. It looks really nice, like a github page. It is no longer maintained unfortunately. Many of the other themes we looked at have really big fonts for the posts, which gives a really bloggy look, but is not conducive to an ELN.

Two things needed tweaking for gitsta to be just right: I wanted the author name to be visible directly after the title and I didn’t want comments to show up. This meant editing the content.php file. Finally, the style.css file needs changing to have the word gitsta-child in the comments, to allow it to get dependencies from gitsta and to show up in your list of themes to select.

The editing is pretty easy, since there are lots of guides online for doing this. If you just want to download our edited version to try it, you can get it from here (I might make some more changes in the future). If you want to use it, just download it, rename the directory as gitsta-child and then place it in WordPress/wp-content/themes/ of your installation – it should be good to go!

Plugins

As you saw above, I installed a few plugins which are essential for full functionality:

  • My Private Site – this plugin locks off the site so that only people with a login can access the site. Our ELN is secure – note that this is not a challenge to try to hack us – it sits inside our internal network and as such is not “on the internet”. Nonetheless, anyone with access to the network who could find the IP could potentially read our ELN. This plugin locks off access to everyone not in our lab.
  • Authors Widget – this plugin allows the addition of a little menu to the sidebar (widget) allowing the selection of posts by one author. This allows us to switch between virtual labbooks for each lab member. Users can bookmark their own Author name so that they only see their labbook if they want.
  • Capability Manager Enhanced – you can edit rights of each level of user or create new levels of user. I used this to remove the ability to delete posts.
  • BackWPup – this allows the local backup of all WP content. It’s highly customisable and is recommended.

Other plugins which are non-essential-but-useful:

  • WP Statistics – this is a plugin that allows admin to see how many visits etc the ELN has had that day/week etc. This one works on a local installation like ours. Others will not work because they require the site to be on the internet.
  • WP-Markdown – this allows you to write your posts in md. I like writing in md, but nobody in my lab uses this function.

Gitsta wants to use gust rather than the native WP dashboard. But gust and md were too complicated for our needs, so I uninstalled gust.

Using the ELN

Lab members/users/authors make “posts” for each lab book entry. This means we have formalised how lab book entries are done. We already had a guide for best practice for labbook entries in our lab manual which translates wonderfully to the ELN. It’s nothing earth-shattering, just that each experiment has a title, aim, methods, results and conclusion (just like we were taught in school!). In a paper notebook this is actually difficult to do because our experiments run for days (sometimes weeks) and many experiments run simultaneously. This means you either have to budget pages in the notebook for each separate experiment, interleave entries (which is not very readable) or write up at the end (which is not best practice). With ELNs you just make one entry for each experiment and update all of them as you go along. Problem solved. Edits are possible and it is possible to see what changes have been made and it is even possible to roll back changes.

Posts are given a title. We have a system in the lab for initials plus numbers for each experiment. This is used for everything associated with that experiment, so the files are easy to find, the films can be located and databases can cross-reference. The ELN also allows us to add categories and tags. So we have wide ranging categories (these are set by admin) and tags which can be more granular. Each post created by an author is identifiable as such, even without the experiment code in the title. So it is possible to filter the view to see posts:

  • by one lab member
  • on Imaging (or whatever topic)
  • by date or in a date range

Of course you can also search the whole ELN, which is the thing I need most of all because it gets difficult to remember who did what and when. Even lab members themselves don’t remember that they did an experiment two or more years previously! So this feature will be very useful in the future.

WordPress allows pictures to be uploaded and links to be added. It is easy to insert images to show examples of how an experiment went. For data that is captured digitally this is a case of uploading the file. For things that are printed out or are a physical thing, e.g. western films or gel doc pictures, we are currently taking a picture and adding these to the post. In theory we can add hard links to data on our server. This is certainly not allowed in many other ELNs for security reasons.

In many ways the ELN is no different to our existing lab books. Our ELN is not on the internet and as such is not accessible from home without VPN to the University. This is analogous to our current set up where the paper lab books have to stay in the lab and are not allowed to be taken home.

Finally, in response to a question on Twitter after the previous ELN post: how do we protect against manipulation? Well, previously we followed best practice for paper books. We used hard bound books with numbered pages (ensuring pages couldn’t be removed), Tipp-Ex was not allowed, edits had to be done in a different colour pen and dated etc. I think the ELN is better in many ways. Posts cannot be deleted, edits are logged and timestamped. User permissions mean I know who has edited what and when. Obviously, as with paper books, if somebody is intent on deception, they can still falsify their own lab records in some way. In my opinion, the way to combat this is regular review of the primary data and also maintaining an environment where people don’t feel like they should deceive.

The post title is taken from “Notes To The Future” by Patti Smith; the version I have is recorded Live in St. Mark’s Church, NYC in 2002 from Land (1975-2002). I thought this was appropriate since a lab notebook is essentially notes to your future self. ELNs are also the future of taking notes in the lab.

The Soft Bulletin: Electronic Lab Notebooks

We finally took the plunge and adopted electronic lab notebooks (ELNs) for the lab. This short post describes our choice of software. I will write another post about how it’s going, how I set it up and other technical details.

tl;dr we are using WordPress as our ELN.

First, so you can understand my choice, here is my wishlist of requirements for the perfect ELN.

  1. Easy-to-use. Allow adding pictures and notes easily.
  2. Versioning (ability to check edits and audit changes)
  3. Backup and data security
  4. Ability to export and go elsewhere if required
  5. Free or low cost
  6. Integration with existing lab systems if possible
  7. Open software, future development
  8. Clarity over who owns the software, who owns the data, and where the information is stored
  9. Can be deployed for the entire lab

There are many ELN software solutions available, but actually very few fulfil all of those requirements. So narrowing down the options was quite straightforward in the end. Here is the path I went down.

Evernote

I have used Evernote as my ELN for over a year. I don’t do labwork these days, but I make notes when doing computer programming, data analysis and writing papers. I also use it for personal stuff. I like it a lot, but Evernote is not an ELN solution for a whole lab. First, there is an issue over people using it for work and for personal stuff. How do we archive their lab documents without accessing other data? How do we pay for it? What happens when they leave? These sorts of issues prevent the use of many of the available ELN software packages, for a whole lab. I think many ELN software packages would work well for individuals, but I wanted something to deploy for the whole lab. For example, so that I can easily search and find stuff long after the lab member has left and not have to go into different packages to do this.

OneNote

The next most obvious solution is OneNote from Microsoft. Our University provides free access to this package and so using it would get around any pricing problems. Each lab member could use it with their University identity, separating any problems with work/life. It has some nice features (shared by Evernote) such as photographing documents/whiteboards etc and saving them straight to notes. I know several individuals (not whole labs) using this as their ELN. I’m not a big fan of running Microsoft software on Macs and we are completely Apple native in the lab. Even so, OneNote was a promising solution.

I also looked into several other software packages:

I liked the sound of RSpace, but it wasn’t clear to me who they were, why they wanted to offer a free ELN service and where they would store our data and what they might want to do with it. Last year, the scare that Evernote were going to snoop on users’ data made me realise that when it came to our ELNs – we had to host the data. I didn’t want to trust a company to do this. I also didn’t want to rely on a company to:

  • continue to do what we sign up for, e.g. provide free software
  • keep updating the software, e.g. so that macOS updates don’t kill it
  • not sell up to an evil company
  • do something else that I didn’t agree with.

As I saw it, this left one option: self-hosting. Not only that, there were only two possibilities.

Use a wiki

This is – in many ways – my preferred solution. Wikis have been going for years and they are widely used. I set one up and made a lab notebook entry. It was great. I could edit it and edits were timestamped. It looked OK (but not amazing). There were possibilities to add tables, links etc. However, I thought that doing the code to make an entry would be a challenge for some people in the lab. I know that wikis are everywhere and that editing them is simple, but I kept thinking of the project student that comes to the lab for a short project. They need to read papers to figure out their project, they have to learn to clone/run gels/image cells/whatever AND then they also have to learn to write in a wiki? Just to keep a log of what they are doing? For just a short stay? I could see this meaning that the ELN gets neglected and things don’t get documented.

I know other labs are using a wiki as an ELN and they do it successfully. It is possible, but I don’t think it would work for us. I also needed to entice people in the lab to convert them from using paper lab notebooks. This meant something that looked nice.

Use WordPress

This option I did not take seriously at first. A colleague told me two years ago that WordPress would be the best platform for an ELN, and I smiled politely. I write this blog on a wordpress dot com platform, but somehow didn’t consider it as an ELN option. After looking for alternatives that we could self-host, it slowly dawned on me that WordPress (a self-hosted installation) actually meets all of the requirements for an ELN.

  1. It’s easy-to-use. My father, who is in his 70s, edits a website using WordPress as a platform. So any person working in the lab should be able to do it.
  2. Versioning. You can see edits and roll back changes if required. Not as granular as wiki but still good.
  3. Backup and data security. I will cover our exact specification in a future post. Our ELN is internal and can’t be accessed from outside the University. We have backup and it is pretty secure. Obviously, self-hosting means that if we have a technical problem, we have to fix it. Although I could move it to new hardware very quickly.
  4. Ability to export and go elsewhere if required. It is simple to pack up an xml and move to another platform. The ubiquity of WordPress means that this will always be the case.
  5. Free or low cost. WordPress is free and you can have as many users as you like! The hardware has a cost, but we have that hardware anyway.
  6. Integration with existing lab systems if possible. We use naming conventions for people’s lab book entries and experiments. Moving to WordPress makes this more formal. Direct links to the primary data on our lab server are possible (not necessarily true of other ELN software).
  7. Open software, future development. Again WordPress is ubiquitous and so there are options for themes and plugins to help make it a good ELN. We can also do some development if needed. There is a large community, meaning tweaking the installation is easy to do.
  8. Clarity over who owns the software, who owns the data, and where the information is stored. It’s installed on our machines and so we don’t have to worry about this.
  9. It can be deployed for the whole lab. Details in the follow-up post.

It also looks good and has a more up-to-date feel to it than a wiki. A screenshot of an innocuous lab notebook entry is shown to the right. I’ve blurred out some details of our more exciting experiments.

It’s early days. I started by getting the newer people in the lab to convert. Anyone who had only a few months left in the lab was excused from using the new system. I’m happy with the way it looks and how it works. We’ll see how it works out.

The main benefits for me are readability and being able to look at what people are doing. I’m looking forward to being able to search back through the entries, as this can be a serious timesuck with paper lab notebooks.

Edit 2017-04-26T07:28:43Z: After posting this yesterday a few other suggestions came through that you might want to consider.

Labfolder, I had actually looked at this and it seems good but at 10 euros per user per month, I thought it was too expensive. I get that good software solutions have a cost and am not against paying for good software. I’d prefer a one-off cost (well, of course I’d prefer free!).

Mary Elting alerted me to Shawn Douglas’s lektor-based ELN. Again this ticks all of the boxes I mentioned above.

Manuel Théry suggested ELab. Again, I hadn’t seen this and it looks like it meets the criteria.

The Soft Bulletin is an occasional series of posts about software choices in research. The name comes from The Flaming Lips LP of the same name.

The Soft Bulletin: PDF organisation

I recently asked on Twitter for any recommendations for software to organise my PDFs. I got several replies, but nothing really fitted the bill. This is a brief summary.

My situation

I have quite a lot of books, textbooks, cheat sheets, manuals, protocols etc. in PDF format and I need a way to organise them. I don’t need to reference this content, I just need to search it and access it quickly – ideally across several devices.

Note: I don’t collect PDFs of research articles. I have a hundred or so articles that were difficult to get hold of, and I keep those, but I’m pretty complacent about my access to scholarly literature.

I currently use Papers2 for storing my PDFs. It’s OK, but there are some bugs in it. Papers3 came out a few years ago, but I didn’t do the upgrade because there are issues with sync across multiple computers. Now it doesn’t look like Papers will be supported in the future. For example, I heard on Twitter that there is no ETA for an issue with Papers3 on Sierra. Future proofing – I’ve come to realise – is important to me as I am pretty loyal to software. I don’t like to change to something else, but I do like new features and innovation.

I don’t need a solution for referencing. I am resigned to using EndNote for that.

Ideally I just want something like iTunes to organise my PDFs, but I don’t want to use iTunes! Perhaps my requirements are too particular and what I want just isn’t available.

The suggestions

Thanks to everyone who made suggestions. Together with other solutions they were (in no particular order):

Zotero

www.zotero.org

I downloaded this and gave it a brief try. PDF import worked well and the UI looked OK. I stumbled on the sync capabilities. I currently sync my computers with Unison and this is complicated (but not impossible) to do for Zotero. They want you to use cloud syncing – which I would probably be OK with. I need to test out which cloud service is best to use. There is a webDAV option which my University supports and I think this would work for me. I think this software is the most likely candidate for me to switch to.

Mendeley

www.mendeley.com

This software got the most recommendations. I have to admit that the Elsevier connection is a huge turn-off for me. Although the irony of using it to organise my almost exclusively Elsevier-free content would be quite nice. I know that most of this type of software has been bought out by the publishing giants (Papers by Springer, EndNote by Thomson Reuters/Clarivate), but I don’t like this and I don’t have to support it if I don’t want to. I didn’t look into sync capabilities here.

Bookends

www.sonnysoftware.com

People rave about this software package for Mac. I like the fact that it has a separate lineage to the other packages. It is very expensive and it is primarily a referencing package. Right now, I’m just looking for something to organise my PDFs and this seems to be overkill.

Evernote

www.evernote.com

I use Evernote as a lab notebook and it is possible to use it to store PDFs. You can make a NoteBook for them, add a Note for each one and attach the PDF. The major plus here is that I already use it (and pay for it). The big negative is that I would prefer a separate standalone package to organise my PDFs. I know, difficult to please aren’t I?

Finder and Spotlight

This is the D.I.Y. option.

I have to say that this is the most appealing in many ways. If you just name PDFs systematically and store them in a folder hierarchy that you organise and tag – it would work. Sync would work with my current solution. Searching with Spotlight would work just as well as any other program. I would not need another program! At some point in the past I organised my PDFs like this. I moved to storing them in Papers so that it would save them in a hierarchical structure for me. This is what I mean by an iTunes-like organiser. An app to name, tag and file-away the PDFs would be ideal. I don’t want to go back to this if I can help it.

ReadCube

www.readcube.com

Like Mendeley, this is an option that I did not seriously entertain. I think this is too far away from what I want. As I see it, this software is designed as a web extension and paper recommendation service, which is not what I’m looking for.

Papers3

papersapp.com

As mentioned above, the lack of updates to this software and problems with sync mean that I am looking for something else. I really liked Papers2 and would be happy to continue using this if various things like import and editing were improved. I guess the option here is to stay with Papers2 and put up with the little things that annoy me. At some point though there will be a macOS update which breaks it and then I will be stuck.

EndNote

endnote.com

I use EndNote for referencing. I hate EndNote with a passion. But I can use it. I know how to write styles etc. and edit term lists because I’ve used it since something like v3. At some point in the past I began to store papers in EndNote. I stopped doing this and moved to Papers2. I have to admit it’s OK at doing this, although the way it organises the PDFs on disk is a bit strange IMO. I don’t like storing books and other content in my library though so this is not a good solution.

iBooks

Here is a curveball. I use iBooks and Kindle app for reading books in mobi/epub/pdf format. Actually, iBooks works quite well for PDFs and has the ability to sync with other devices. I have a feeling this could work, although some of the PDFs I have are quite bulky and I’d need to figure out a way for them to stay in the cloud and not reside on mobile devices. It’s definitely designed for reading books and not for pulling up the PDF in Preview and quickly finding a specific thing. For this reason I don’t think it would work.

Note that there are other apps for this task. Also, if you search for “PDF” in the App Store, there are plenty of other programs aimed at people outside academia. Maybe one of those would be OK.

So what did I do?

I doubt anyone has the precise requirements that I have and so you’re probably not interested in what I decided. However, the simplest thing to do was to import the next batch of PDFs into Papers2 and wait to see if something better comes along. I will try Zotero a bit more when I get some time and see if this is the solution for me.

The post title is taken from The Flaming Lips’ 1999 album “The Soft Bulletin”.

Bateman Writes: 1994

BBC 6Music recently went back in time to 1994. This made me wonder what albums released that year were my favourites. As previously described on this blog, I have this information readily available. So I quickly crunched the numbers. I focused on full-length albums and, using play density (sum of all plays divided by number of album tracks) as a metric, I plotted out the Top 20.
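
For anyone who wants to try the same calculation, here is a minimal Python sketch of the play density metric. It is not the code I used: the album names and play counts are invented placeholders, the function names are hypothetical, and it assumes that every track of each album is present in the list.

```python
from collections import defaultdict


def play_density(tracks):
    """Sum of plays divided by number of tracks, for each album.

    tracks is a list of (album, play_count) pairs, one pair per track.
    """
    plays = defaultdict(int)
    n_tracks = defaultdict(int)
    for album, play_count in tracks:
        plays[album] += play_count
        n_tracks[album] += 1
    return {album: plays[album] / n_tracks[album] for album in plays}


def top_albums(tracks, n=20):
    """Return the n albums with the highest play density, best first."""
    density = play_density(tracks)
    return sorted(density.items(), key=lambda item: item[1], reverse=True)[:n]


# Invented example data: one (album, plays) pair per track.
example_tracks = [
    ("Album A", 10), ("Album A", 20), ("Album A", 30),
    ("Album B", 5), ("Album B", 5),
]
ranking = top_albums(example_tracks)
```

Dividing by the number of tracks stops albums with lots of short tracks from winning just because they have more tracks to play.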

1994

There you have it. Scorn’s epic Evanescence has the highest play density of any album released in 1994 in my iTunes library. By some distance. If you haven’t heard it, this is an amazing record that broke new ground and spawned numerous musical genres. I think that record, One Last Laugh In A Place of Dying… and Ro Sham Bo would all be high on my all-time favourite list. A good year for music then as far as I’m concerned.

Other observations: I was amazed that Definitely Maybe was up there, since I am not a big fan of Oasis. Likewise for Dummy by Portishead. Note that Oxford’s Angels and Superdeformed[…] are bootleg records.

Bubbling under: this was the top 20, but there were some great records bubbling under in the 20s and 30s. Here are the best 5.

  • Heatmiser – Cop and Speeder
  • Circle – Meronia
  • Credit to the Nation – Take Dis
  • Kyuss – Welcome to Sky Valley
  • Drive Like Jehu – Yank Crime

I heard tracks from some of these bands on 6Music, but many were missing. Maybe there is something for you to investigate.

Part of a series obsessively looking at music in an obsessive manner.

Meeting in the Aisle

Lab meetings: love them or loathe them, they’re an important part of lab-life. There are many different formats and ways to do a lab meeting. Sometimes it feels like we’ve tried them all! I’m going to describe our current format and then discuss some other things to try.

Our current lab meeting format is:

  • Weekly. For one hour (Wednesdays at 9am)
  • One person each week talks about their progress. It rotates around.
  • At the start, we talk about general lab issues.
  • Then, last week’s data presenter does a 5-minute, one-slide journal club on a paper of their choice.
  • We organise the rota and table any issues using our general lab Trello board.

Currently, we meet in one of the pods in our building. A pod is a sound-proofed booth that seats 8 people on two sofa style seats. It has a table and an additional 2 people can cram in if needed. Previously we used a meeting room, with the presenter stood at the front using PowerPoint with a projector. One week the meeting room was unavailable and so we used a pod instead. It is a lot more informal and the suggestions and discussions flowed as a result. So we have kept the meeting in the pod, using a laptop to present data.

In addition to this, each person in my lab meets with me for 30 min on a Monday morning to go through raw data and troubleshooting. They also present a more formal talk to the centre once every 6-9 months. I mention this to give some context. Our lab meetings are something between “my cloning hasn’t worked” and a polished presentation.

I’m happy with the current arrangement, but we’ve tried many alternatives. Here is a brief list of things you can consider.

Two presenters

In my opinion this is a bad idea. We went through a period of doing this so that lab presentations were more frequent, or because we were also doing journal clubs (I forget which). What happens is that one person has a lot of data and gets lots of discussion and then we either run out of time or the other person feels bad if they don’t have as much stuff to talk about. You have accidentally created unnecessary competition amongst lab members, which is not good. Just go for one presenter. The presenter feels like it is their day to get as much as they can out of the meeting and then next week the focus will move to someone else.

Round-the-table

This is where you go round and people say what they have done since the last meeting. Depending on the size of the group, this probably takes 2 hours or “as long as it takes” which cuts further into the working day. If the meeting is too frequent, lab members can soon get into a groove of saying “nothing worked” each time and it’s difficult to keep track of who is struggling. Not only is it easy for people to hide, the meeting can also become dominated by someone with interesting data. The format also doesn’t develop any presentation/explanation skills. My preference is to keep the focus on one person.

Rotating data talk and journal clubs

It is really common, especially if you have a small group, to do data presentation one week and then journal club the next week. My feelings on Journal Clubs are: if they are done properly, they can be really useful and constructive. Too often they regress into the complete trashing of a paper. As fun as this is, it doesn’t teach trainees the right skills. I’d love it if people in the lab were on top of the literature, but forcing people to delve deeply into one paper is not very effective in promoting this behaviour. I think that it’s more important to use the lab meeting time to go through lab data rather than talk about someone else’s work. Some labs have it set up where the presenter can pick data or paper, which means people who are struggling with their project can hide behind presenting papers. I’m not a fan. We currently do a 5-minute journal club in which the presenter briefly covers a paper and says why they thought it was good. This takes up minimal time and people can read more deeply if they want. I got this tip from another lab. I recently heard of a lab who spend one meeting a month going through one paper per lab member. We might try this in the future. We also have a list on our General lab Trello board for suggesting cool papers that people think others should read.

Banning PowerPoint, western films on the table

At some point I got fed up with seeing a full-on talk from lab members each week, with an introduction and summary (and even acknowledgements!). This was partly because it was very repetitive, partly because it inhibited discussions, and also because I felt people were spending too much time preparing their talk. Moving to the pod (see above) kind of solved this naturally. In the past, we did a total back-to-basics: “PowerPoint is now banned: bring your lab book and let’s see the raw data”. This was a good shock to the system. However, people started printing out diagrams… these were made in PowerPoint… and before I knew it, PowerPoint was back! Now, there is value in lab members giving a proper talk in lab meeting. Everyone needs to learn to do it and it can quickly get people used to presenting. Not everyone is great at it though and what lab members need from a lab meeting – I believe – is feedback on their project and injection of new ideas. A formal talk from someone struggling to do a good job or overcome with nervousness doesn’t help anyone. I prefer to keep things informal, with lots of interruptions, questions and enthusiasm from the audience.

Joint lab meetings

When my group was starting and I just had two people, we joined in with another lab in their lab meetings. This worked well until my group became too large. What was good was that the other PI was more experienced and liked to do a “blood on the floor” style of lab meeting. This is not really my style, but we had a “good cop, bad cop” thing going on which was useful. For a while. If the lab ethos is too different it can cause friction and if the other PI has any bad habits, things can quickly unravel. There are also issues around collaboration and projects overlapping which can make joint lab meetings difficult. So, this can be useful if you can find the right lab to partner with, but proceed with caution.

Themed lab meetings 

No, not turning up dressed as someone from The Rocky Horror Picture Show… In my lab we work in two different areas. For a few years we segregated the lab meetings by theme. This seemed like a great idea initially, but in the end I changed from this because I worried it set up an artificial divide. People from the other theme started to ask if they could skip the meeting and work in the lab instead. There were also different numbers of people working on the two themes. I tried to rotate the presenters fairly, but there was resentment that people presented more often on one theme than the other. I know some dual-PI labs who do this successfully, but they have far more people. This is not recommended for a regular one-PI lab with fewer than 10 people. Anyway, most labs just work in one area.

Skype and remote lab meetings

For about one year, we had a student join our lab meetings via Skype. She was working at another university and it was important for her to be involved in these meetings. It worked OK and she could even present her data when it was her turn. We used the lab Dropbox folder for sharing slides, papers and data with her. We still use this folder now for that purpose. I know PIs who Skype in to lab meetings when they are away, so that the lab meeting always goes ahead at the same time each week. I have never done this and don’t think it would work for our lab.

Fun stuff – breaking the routine

OK. Depending on your definition of fun… To check on the state of people’s lab books, I ask lab members to bring them along to the lab meeting without warning, get them to swap with a random person, and then ask them to explain what that person did in the lab on a random date. It gets the message across and also brings up issues people are having with recording their data. We also occasionally do fun stuff such as quizzes but tend to do these outside of the lab meeting. I’ve also used the lab meeting to teach people how to do things in a software package or some other demo. This breaks things up a bit and can freshen up the lab meeting routine. Something else to consider to keep it fun: a cookie schedule. We don’t have one, but people randomly bring in some food if they have been away somewhere or they have cooked a delicacy from their home country.

State of the lab address

Once a year, normally in January when no-one wants to do the first lab meeting of the New Year, I do a state of the lab address. I go through the goals and objectives of the lab. Things that I feel are going well, areas where we could have done better. Successes from last year. The aim is to set the scene for the year ahead.

People in the lab can get a bit deep into their project and having some kind of overview is actually really helpful for them (or so they tell me!). Invite them along if you are giving a seminar or use a lab meeting to try out a seminar you are going to give so that they can see the big picture.

Ideas session

It doesn’t happen often that a presenter has nothing to present. The gaps between presenters are long enough to ensure this doesn’t happen. However, sometimes it can be that the person scheduled to talk has just given a bigger talk to the whole centre (and I forgot to check). When this has happened, we have switched to a forward-looking lab meeting to plan out ideas. Again this can break up the routine.

Time

I think 1 hour is enough. Any longer and it can start to drag out. I try to make it every week. Occasionally it gets cancelled when my schedule doesn’t allow it. But if the schedule gets too ad hoc, it sends the wrong message to the lab members.

Wednesday morning works well for us, but we’ve tried Tuesday mornings, Wednesday afternoons etc. I’m happy to set this by the demands from experiments etc. For example, most people in my lab like to image cells Thursday and Friday so those days are off limits. I also ask that everyone comes on time, and try to lead by example. I know a lab where they instigated a 1 Euro fine for lateness, including the PI. This is used as a cookie fund.

No lab meeting at all!

During my PhD we never had a regular lab meeting. Well, I can remember a few occasions where we tried to get it going but it didn’t stick. In my postdoc lab we similarly failed to do it regularly. I didn’t mind at the time and was happy to spend the time instead working in the lab. However, I can see that many issues in the lab would’ve probably been solved by regular meetings. So I’m pro-lab meeting.

And finally…

Maybe this should have been at the beginning… but what exactly is the point of a lab meeting?

Presenter – Feedback on their project, injection of new ideas, is this the right route to go down? etc. Improve presentation skills; explaining their project to others can help understanding.

Other lab people – Update on the presenter’s project, a feeling for what is expected, ideas for their own project. Have your say and learn to ask questions constructively.

PI – Update on project, give feedback, oversee the tone and standard.

Everyone – lab cohesion, a chance to address issues around the lab, catch up on the latest papers and data.

If none of the above suggestions sound good to you, maybe think about what you are trying to get out of your lab meetings and design a format that helps you achieve this.

The post title is taken from Meeting in the Aisle by Radiohead, B-side on the Karma Police single.

Adventures in Code IV: correcting filenames

A large amount of time doing data analysis is the process of cleaning, importing, reorganising and generally not actually analysing data but getting it ready to analyse. I’ve been trying to get over the idea to non-coders in the group that strict naming conventions (for example) are important and very helpful to the poor person who has to deal with the data.

Things have improved a lot and datasets that used to take a few hours to clean up are now pretty much straightforward. A recent example is shown here. Almost 200 subconditions are plotted out and there is only one missing graph. I suspect the blood sugar levels were getting low in the person generating the data… the cause was a hyphen in the filename and not an underscore.

These data are read into Igor from CSVs outputted from Imaris. Here comes the problem: the folder and all files within it have the incorrect name.

There are 35 files in each folder and clearly this needs a computer to fix, even if it were just one folder’s worth at fault. The quickest way is to use the terminal and there are lots of ways to do it.

Now, as I said the problem is that the foldername and filenames both need correcting. Most terminal commands you can quickly find online actually fail because they try to rename the file and folder at the same time, and since the folder with the new name doesn’t exist… you get an error.

The solution is to rename the folders first and then the files.


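# 1. Rename the folders first (-maxdepth goes before the tests; GNU find warns otherwise)
find . -maxdepth 2 -type d -name "oldstring*" | while IFS= read -r FNAME; do mv "$FNAME" "${FNAME//oldstring/newstring}"; done
# 2. Then rename the CSV files, which now sit inside the renamed folders
find . -maxdepth 3 -type f -name "oldstring*.csv" | while IFS= read -r FNAME; do mv "$FNAME" "${FNAME//oldstring/newstring}"; done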

A simple tip, but effective and useful. HT this gist
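Before letting a bulk rename loose on real data, it can be reassuring to preview what it will do. The following is a minimal sketch of that idea, assuming bash; it is not taken from the gist, and the function name rename_tree and its "apply" argument are made up for illustration. It follows the same folders-first, files-second order described above, but only rewrites the last component of each path.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post or the gist).
# Usage: rename_tree ROOT OLD NEW [apply]
# Without "apply" it only prints the renames it would make.
rename_tree() {
    local root="$1" old="$2" new="$3" mode="${4:-preview}"
    local kind path dir base target
    # Folders first, then files, as in the text above.
    for kind in d f; do
        # -depth lists a folder's contents before the folder itself,
        # so a rename never invalidates a path that is still to come.
        find "$root" -depth -type "$kind" -name "*${old}*" |
        while IFS= read -r path; do
            dir=$(dirname "$path")
            base=$(basename "$path")
            # Only the final path component is rewritten (OLD is treated as a pattern).
            target="$dir/${base//$old/$new}"
            if [ "$mode" = "apply" ]; then
                mv "$path" "$target"
            else
                echo "mv $path -> $target"
            fi
        done
    done
}
```

For the hyphen problem above, something like `rename_tree . "-" "_"` would list the proposed changes, and adding `apply` as a fourth argument would carry them out. In preview mode the file paths are shown under the old folder names, because nothing has been renamed yet.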

Part of a series on computers and coding

Tips from the blog XI: Overleaf

I was recently an external examiner for a PhD viva in Cambridge. As we were wrapping up, I asked “if you were to do it all again, what would you do differently?”. It’s one of my stock questions and normally the candidate says “oh I’d do it so much quicker!” or something similar. However, this time I got a surprise. “I would write my thesis in LaTeX!”, was the reply.

As a recent convert to LaTeX I could see where she was coming from. The last couple of manuscripts I have written were done in Overleaf and have been a breeze. This post is my summary of the site.

I have written ~40 manuscripts and countless other documents using Microsoft Word for Mac, with EndNote as a reference manager (although I have had some failed attempts to break free of that). I’d tried and failed to start using TeX last year, motivated by seeing nicely formatted preprints appearing online. A few months ago I had a new manuscript to write with a significant mathematical modelling component and I realised that now was the chance to make the switch. Not least because my collaborator said “if we are going to write this paper in Word, I wouldn’t know where to start”.

I signed up for an Overleaf account. For those that don’t know, Overleaf is an online TeX writing tool with the source on one half of the screen and a rendered version of your manuscript on the other. The learning curve is quite shallow if you are used to any kind of programming or markup. There are many examples on the site and finding out how to do stuff is quick thanks to LaTeX wikibooks and stackexchange.

Beyond the TeX, the experience of writing a manuscript in Overleaf is very similar to editing a blog post in WordPress.

Collaboration

The best thing about Overleaf is the ability to collaborate easily. You can send a link to a collaborator and then work on it together. Using Word in this way can be done with Dropbox, but versioning and track changes often cause more problems than they’re worth and most people still email Word versions to each other, which is a nightmare. Overleaf changes this by having a simple interface that can be accessed by multiple people. I have never used Google Docs for writing papers, but this does offer the same functionality.

All projects are private by default, but you can put your document up on the site if you want to. You might want to do this if you have developed an example document in a certain style.

Versioning

Depending on the type of account you have, you can roll back changes. It is possible to ‘save’ versions, so if you get to a first draft and want to send it round for comment, you can save a version and then use this to go back to, if required. This is a handy insurance in case somebody comes in to edit the document and breaks something.

You can download a PDF at any point, or for that matter take all the files away as a zip. No more finalfinalpaper3final.docx…

If you’re keeping score, that’s Overleaf 2, Word nil.

Figures

Placing figures in the text is easy and all major formats are supported. What is particularly nice is that I can generate figures in an Igor layout and output directly to PDF and put that into Overleaf. In Word, the placement of figures can be fiddly. Everyone knows the sensation of moving a picture slightly and it disappears inexplicably onto another page. LaTeX will put the figure in where you want it or the next best place. It just works.

Equations

This is what LaTeX excels at. Microsoft Word has an equation editor which has varied over the years from terrible to just-about-usable. The current version actually uses elements of TeX (I think). The support for mathematical text in LaTeX is amazing, not surprising since this is the way that most papers in maths are written. Any biologist will find their needs met here.

Templates and formatting

There are lots of templates available on Overleaf and many more on the web. For example, there are nice PNAS and PLoS formats as well as others for theses and for CVs and other documents. The typesetting is beautiful. Setting out sections/subsections and table of contents is easy. To be fair to Word, if you know how to use it properly, this is easy too, but the problem is that most people don’t, and also styles can get messed up too easily.

Referencing

This works by adding a BibTeX file to your project. You can do this with any reference manager. Because I have a huge EndNote database, I used this initially. For another manuscript I’ve been working on, my student started out with a Mendeley library and we’ve used that. It’s very flexible. Slightly more fiddly than with Word and EndNote. However, I’ve had so many problems (and crashes) with that combination over the years that any alternative is a relief.

Compiling

You can set the view on the right to compile automatically or you can force updates manually. Either way the document must compile. If you have made a mistake, it will complain and try to guess what you have done wrong and tell you. Errors that prevent the document from being compiled are red. Less serious errors are yellow and allow compilation to go ahead. This can be slow going at first, but I found that I was soon up to speed with editing.

Preamble

This is the name of the stuff at the header of a TeX document. You can add in all kinds of packages to cover proper usage of units (siunitx) or chemical notation (mhchem). They all have great documentation. All the basics, e.g. referencing, are included in Overleaf by default.
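As a rough illustration of what this looks like, here is a minimal document with a preamble that loads the two packages mentioned above, plus one display equation. It is only a sketch: the sentence and the equation (Michaelis-Menten kinetics) are arbitrary examples rather than anything from a real manuscript, and newer siunitx releases prefer \qty over the older \SI used here.

```latex
\documentclass{article}

% Preamble: everything between \documentclass and \begin{document}
\usepackage{amsmath}            % equation environments
\usepackage{siunitx}            % consistent numbers and units
\usepackage[version=4]{mhchem}  % chemical formulae with \ce{...}

\begin{document}

% Units via siunitx, chemistry via mhchem
Cells were kept at \SI{37}{\degreeCelsius} in \SI{5}{\percent} \ce{CO2}.

% A numbered display equation, chosen purely as an example
\begin{equation}
  v = \frac{V_{\max}\,[\mathrm{S}]}{K_{\mathrm{m}} + [\mathrm{S}]}
\end{equation}

\end{document}
```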

Offline

The entire concept of Overleaf is to work online. Otherwise you could just use TeXshop or some other program. But how about times when you don’t have internet access? I was concerned about this at the start, but I found that in practice, these days, times when you don’t have a connection are very few and far between. However, I was recently travelling and wanted to work on an Overleaf manuscript on the aeroplane. Of course, with Word, this is straightforward.

With Overleaf it is possible. You can do two things. The first is to download your files ahead of your period of internet outage. You can edit your main.tex document in an editor of your choice. The second option is more sophisticated. You can clone your project with git and then work on that local clone. The instructions of how to do that are here (the instructions, from 2015, say it’s in beta, but it’s fully working). You can work on your document locally and then push changes back to Overleaf when you have access once more.
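The git route might look something like the sketch below. The assumptions are loud ones: the function names are invented for this example, and the remote URL is deliberately left as an argument because it has to be the git link that Overleaf gives you for your own project. Only standard git commands are used.

```shell
#!/usr/bin/env bash
# Hypothetical helpers sketching the offline workflow; not an official Overleaf script.

# Before going offline: take a local copy of the project.
# $1 = the git URL Overleaf shows for your project, $2 = local folder
clone_project() {
    git clone "$1" "$2"
}

# While offline: record your edits locally (no network needed).
# $1 = local folder, $2 = commit message
commit_offline() {
    git -C "$1" add -A
    git -C "$1" commit -m "$2"
}

# Back online: fetch collaborators' changes first, then send yours.
# $1 = local folder
sync_project() {
    git -C "$1" pull --no-rebase && git -C "$1" push
}
```

If a collaborator edited the same lines while you were away, the pull step will stop with a merge conflict that has to be resolved by hand before pushing.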

Downsides

OK. Nothing is perfect and I noticed that typos and grammatical errors are more difficult for me to detect in Overleaf. I think this is because I am conditioned with years of Word use. The dictionary is smaller than in Word and it doesn’t try to correct your grammar like Word does (although this is probably a good thing!). Maybe I should try the rich text view and see if that helps. I guess the other downside is that the other authors need to know TeX rather than Word. As described above, if you are writing with a mathematician, this is not a problem. For biologists though this could be a challenge.

Back to the PhD exam

I actually think that writing a thesis is probably a once-in-a-lifetime chance to understand how Microsoft Word (and EndNote) really works. The candidate explained that she didn’t trust Word enough to do everything right, so her thesis was made of several different documents that were fudged to look like one long thesis. I don’t think this is that unusual. She explained that she had used Word because her supervisor could only use Word and she had wanted to take advantage of the Review tools. Her heart had sunk when her supervisor simply printed out drafts and commented using a red pen, meaning that she could have done it all in LaTeX and it would have been fine.

Conclusion

I have been totally won over by Overleaf. It beats Microsoft Word in so many ways… I’ll stick to Word for grant applications and other non-manuscript documents, but I’m going to keep using Overleaf for manuscripts, with the exception of papers written with people who will only use Word.

Elevation: accuracy of a Garmin Edge 800 GPS device

I use a Garmin 800 GPS device to log my cycling activity, including my commutes. Since I have now built up nearly 4 years of cycling the same route, I had a good dataset to look at how accurate the device is.

I wrote some code to import all of the rides tagged with commute in rubiTrack 4 Pro (technical details are below). These tracks needed categorising so that they could be compared. Then I plotted them out as a gizmo in Igor Pro and compared them to a reference data set which I obtained via GPS Visualiser.

The reference dataset is shown in black and gives the “true” elevation at those particular latitude and longitude coordinates. Plotted on to that are the commute tracks coloured red-white-blue according to longitude. You can see that there is a range of elevations recorded by the device; apart from a few outliers they are mostly accurate but offset. This is strange because I have the elevation of the start and end points saved in the device and I thought it adjusted the altitude it was measuring to these elevations when recording the track. Obviously not.

To look at the error in the device I plotted out the difference in the measured altitude at a given location versus the true elevation. For each route (to and from work) a histogram of elevation differences is shown to the right. The average difference is 8 m for the commute in and 4 m for the commute back. This is quite a lot considering that all of this is only ~100 m above sea level. The standard deviation is 43 m for the commute in and 26 m for the way back.
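The summary numbers above boil down to a mean and a standard deviation of (measured minus reference) elevation. A minimal sketch of that calculation with awk follows; my actual analysis was done in Igor, so this is a stand-in rather than the real code. The function name, the two-column input format and the use of the sample standard deviation (dividing by n - 1) are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads lines of "measured_elevation reference_elevation"
# (in metres) on stdin and prints the mean and sample SD of the differences.
elevation_error_stats() {
    awk '
        NF >= 2 {
            d = $1 - $2          # measured minus reference
            n++
            sum += d
            sumsq += d * d
        }
        END {
            if (n < 2) { print "need at least two points"; exit 1 }
            mean = sum / n
            sd = sqrt((sumsq - n * mean * mean) / (n - 1))   # sample SD
            printf "mean=%.1f sd=%.1f n=%d\n", mean, sd, n
        }'
}
```

With four made-up points, `printf '110 100\n104 100\n98 100\n120 100\n' | elevation_error_stats` prints `mean=8.0 sd=9.4 n=4`.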

This post at VeloViewer comparing GPS data on Strava from pro-cyclists riding stage 15 of the 2015 Giro d’Italia sprang to mind. Some GPS devices performed OK, whereas others (including Garmin) did less well. The idea in that post is that rain affects the recording of some units. This could be true and although I live in a rainy country, I doubt it can account for the inaccuracies recorded here. Bear in mind that that stage was over some big changes in altitude, whereas my recordings cover very little. On the other hand, there are very few tracks in that post whereas there is lots of data here.

It’s interesting that the data is worse going in to work than coming back. I do set off quite early in the morning and it is colder etc first thing which might mean the unit doesn’t behave as well for the commute to work. Both to and from work tracks vary most in lat/lon recordings at the start of the track which suggests that the unit is slow to get an exact location – something every Garmin user can attest to – although I always wait until it has a fix before setting off. The final two plots show what the beginning of the return from work looks like for location accuracy (travelling east to west) compared to a midway section of the same commute (right). This might mean that the inaccuracy at the start determines how inaccurate the track is. As I mentioned, the elevation is set for start and end points. Perhaps if the lat/lon is too far from the endpoint it fails to collect the correct elevation.

Conclusion

I’m disappointed with the accuracy of the device. However, I have no idea whether other GPS units (including phones) would outperform the Garmin Edge 800 or even if later Garmin models are better. This is a good but limited dataset. A similar analysis would be possible on a huge dataset (e.g. all strava data) which would reveal the best and worst GPS devices and/or the best conditions for recording the most accurate data.

Technical details

I described how to get GPX tracks from rubiTrack 4 Pro into Igor and how to crunch them in a previous post. I modified the code to get elevation data out from the cycling tracks and generally made the code slightly more robust. This left me with 1,200 tracks. My commutes are varied. I frequently go from A to C via B and from C to A via D which is a loop (this is what is shown here). But I also go A to C via D, C to A via B and then I also often extend the commute to include 30 km of Warwickshire countryside. The tracks could be categorized by testing whether they began at A or C (this rejected some partial routes) and then testing whether they passed through B or D. These could then be plotted and checked visually for any routes which went off course; there were none. The key here is to pick the right B and D points. To calculate the differences in elevation, the simplest thing was to get GPS Visualiser to tell me what the elevation should be for all the points I had. I was surprised that the API could do half a million points without complaining. This was sufficient to do the rest. Note that the comparisons needed to be done as lat/lon versus elevation because, due to differences in speed, comparing by time or trackpoint number leads to inherent differences in lat/lon (and elevation). Note also that due to the small scale I didn’t bother converting lat/lon into flat earth kilometres.
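The categorisation step described above can be sketched roughly as follows. This is not the Igor code I used: it is an awk stand-in, and the function name, the "lat lon" input format and the default tolerance of 0.001 degrees are all assumptions for illustration. In keeping with the note above, distances are compared directly in degrees without converting to kilometres.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify one track given as lines of "lat lon" on stdin.
# Usage: classify_track "A_lat A_lon" "B_lat B_lon" "C_lat C_lon" "D_lat D_lon" [tol_degrees]
classify_track() {
    awk -v a="$1" -v b="$2" -v c="$3" -v d="$4" -v tol="${5:-0.001}" '
        # True if (lat, lon) lies within tol degrees of the point "lat lon" in pt.
        function near(lat, lon, pt,    p) {
            split(pt, p, " ")
            return (lat - p[1])^2 + (lon - p[2])^2 <= tol^2
        }
        NF < 2 { next }
        !started {
            # The first trackpoint decides whether the ride began at A or C.
            started = 1
            if (near($1, $2, a)) { start = "A" } else if (near($1, $2, c)) { start = "C" }
        }
        near($1, $2, b) { via_b = 1 }
        near($1, $2, d) { via_d = 1 }
        END {
            # Reject partial routes, and tracks passing neither or both waypoints.
            if (start == "" || via_b == via_d) { print "rejected"; exit }
            print start (via_b ? "-via-B" : "-via-D")
        }'
}
```

A track that starts near A and passes B is reported as A-via-B, one that starts near C and passes D as C-via-D, and anything else as rejected. As noted above, the result depends heavily on picking sensible B and D points (and, here, a sensible tolerance).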

The post title comes from “Elevation” by Television, which can be found on the classic “Marquee Moon” LP.

Reaching Out

Outreach means trying to engage the public with what we are doing in our research group. For me, this mainly means talking to non-specialists about our work and showing them around the lab. These non-specialists are typically interested members of the public and mainly supporters of the charity that funds work in my lab (Cancer Research UK). The most recent batch of activities have prompted this post on doing outreach.

The challenge

Outreach is challenging. Taking part in these events made me realise what a tough job it is to do science communication, and how good the best communicators are.

There are many ways that an outreach talk is tougher to give than a research seminar. Not least because explaining what we do in the lab can quickly spiral down into a full-on Cell Biology 101 lecture.

A statement like “we work on process x and we are studying a protein called y”, needs to be followed by “jobs in cells are done by proteins”, then maybe “proteins are encoded by genes”, in our DNA, which is a bunch of letters, oh there’s mRNA, ahhh stop! Pretty soon, it can get too confusing for the audience. In a seminar, the level of knowledge is already there, so protein x can be mentioned without worrying about why or how it got there.

On the other hand, giving an outreach talk is much easier than giving a seminar because the audience is already warm to you and they don’t want you to stuff it up. It’s a bit like giving a speech at a wedding.

The challenge is exciting because it means that our work needs to be explained plainly and placed in a bigger context. If you get the chance to explain your work to a lay audience, I recommend you try.

Disarming questions

The big difference between doing a scientific talk for scientists and talking to non-specialists is in the questions. They can be disarming, for various reasons. Here are a few that I have had on recent visits. How would you answer?

Can you tell the difference [down the microscope] between cells from a black person versus those from a white person!?

For context, we had just looked at some HeLa cells down the microscope and I had explained a little bit about Henrietta Lacks and the ethical issues surrounding this cell line.

You mentioned evolution but I think you’ll find that the human cell is just too intricate. How do you think cells are really made?

Hint: it doesn’t matter what you reply. You will be unlikely to change their mind.

Do you dream of being famous? What will be your big discovery?

I’ve also been asked “are we close to a cure for cancer?”. It’s important to temper people’s enthusiasm here I think.

Are you anything to do with [The Crick]? No? Good! It’s a waste of money and it shouldn’t have been built in London!

I had wondered if lay people knew about The Crick, which is now the biggest research institute in the UK. Clearly they do! I tried to explain that The Crick is a chance to merge several institutes that already existed in London and so it would save money on running these places.

Aren’t you just being exploited by the pharmaceutical industry?

This person was concerned that academics generate knowledge which is then commercialised by companies.

My friend took a herbal remedy and it cured his cancer. Why aren’t you working on that?

Like the question rejecting evolution, it is difficult for people to abandon their N-of-one/anecdotal knowledge.

Does X cause cancer?

This is a problem of the media in our country I think, who seem to be on a mission to categorise everything (red meat, wine, tin foil) as either cancer-causing or cancer-preventing.

As you can see, the questions are wide-ranging, which is unsettling in itself. It’s very different to “have you tried mutating serine 552 to test if the effect is one of general negative charge on the protein?” that you get in a research seminar.

The charity that organises some of the events I’ve been involved in are really supportive and give a list of good ways to answer “typical questions”. However, most questions I get are atypical, and the anticipated questions about animal research or embryo cloning do not arise.

I find it difficult to give a succinct answer to these lay questions. I try to give an accurate reply, but this leads to a long and complicated answer that probably confuses the person even more. I have the same problem with children’s questions, which often get me scurrying to Wikipedia to find the exact answer for “why the sky is blue”. I should learn to just give a vaguely correct answer and not worry about the details so much.

Amazing questions

The best questions are those where you can tell that the person has really got into it. In the last talk I gave, I described “stop” and “go” signals for cell division. One person asked

How does a cell suddenly know that it has to divide? It must get a signal from somewhere… what is that signal?

My initial reply was that asking these sorts of questions is what doing science is all about!

Two more amazing questions:

Is it true that scientists are secretive with their results and think more about advancing their careers than publicising their findings openly to give us value for money?

This was from a supporter of the charity who had read a piece in The Guardian about scientific publishing. She followed up by asking why do scientists put their research behind paywalls. I found this tough to answer because I suddenly felt responsible for the behaviour of the entire scientific community.

You mentioned taxol and the side effects. I was taking that for my breast cancer and it is true what you said. It was very painful and I had to stop treatment.

This was the first time a patient had talked to me about their experience of things that were actually in my talk. This was a stark reminder that the research I am doing is not as abstract as I think. It also made me more cautious about the way I talk about current treatments, since people in the room may be actually taking them!

Good support

With the charity I’ve been to Polo Clubs, hotels, country houses, Bishop’s houses, relay events in public parks. The best part is welcoming people to our lab. These might be a Mayor or people connected with the city football team, but mainly they are interested supporters of the charity. It’s nice to be able to explain where their money goes and what a life in cancer research is really like.

To do these events, there is a team of people doing all the organisation: inviting participants, sorting out parking, tea and coffee etc. The team are super-enthusiastic and they are really skilled at talking to the public. The events could not go ahead without them. So, a big thank you to them. I’ve also been helped by the folks in the lab and colleagues in my building who have helped to show visitors around and let them see cells down the microscope etc.

Give it a try

Of course there are many other ways to engage the public in our research. This is just focussed on talking to non-scientists and the issues that arise. As I’ve tried to outline here, it’s a fun challenge. If you get the opportunity to do this, give it a try.

The post title comes from “Reaching Out” by Matthew Sweet from his Altered Beast LP. Lovely use of diminished seventh in a pop song and of course the drums are by none other than Mick Fleetwood.