Adventures in Code IV: correcting filenames

A large amount of time in data analysis is spent cleaning, importing and reorganising data: not actually analysing it, but getting it ready to analyse. I’ve been trying to get across to the non-coders in the group that strict naming conventions (for example) are important and very helpful to the poor person who has to deal with the data.

[Image: missingplot]

Things have improved a lot, and datasets that used to take a few hours to clean up are now pretty much straightforward. A recent example is shown here. Almost 200 subconditions are plotted out and there is only one missing graph. I suspect the blood sugar levels were getting low in the person generating the data… the cause was a hyphen in the filename instead of an underscore.

These data are read into Igor from CSVs exported from Imaris. Here comes the problem: the folder and all the files within it have the incorrect name.

There are 35 files in each folder, so this clearly needs a computer to fix, even if it were just one folder’s worth at fault. The quickest way is to use the terminal, and there are lots of ways to do it.

Now, as I said, the problem is that the folder name and the filenames both need correcting. Most terminal commands you can quickly find online actually fail because they try to rename the file and folder at the same time, and since the folder with the new name doesn’t exist… you get an error.

The solution is to rename the folders first and then the files.


# rename the folders first (-maxdepth goes before the other tests to avoid a warning from find)
find . -maxdepth 2 -type d -name "oldstring*" | while read -r FNAME; do mv "$FNAME" "${FNAME//oldstring/newstring}"; done
# then rename the files, which now sit inside the renamed folders
find . -maxdepth 3 -type f -name "oldstring*.csv" | while read -r FNAME; do mv "$FNAME" "${FNAME//oldstring/newstring}"; done

A simple tip, but effective and useful. HT this gist

Part of a series on computers and coding

Tips from the blog XI: Overleaf

I was recently an external examiner for a PhD viva in Cambridge. As we were wrapping up, I asked “if you were to do it all again, what would you do differently?”. It’s one of my stock questions and normally the candidate says “oh I’d do it so much quicker!” or something similar. However, this time I got a surprise. “I would write my thesis in LaTeX!”, was the reply.

As a recent convert to LaTeX I could see where she was coming from. The last couple of manuscripts I have written were done in Overleaf and have been a breeze. This post is my summary of the site.

[Image: overleaf-greygreen-410]

I have written ~40 manuscripts and countless other documents using Microsoft Word for Mac, with EndNote as a reference manager (although I have had some failed attempts to break free of that). I’d tried and failed to start using TeX last year, motivated by seeing nicely formatted preprints appearing online. A few months ago I had a new manuscript to write with a significant mathematical modelling component and I realised that now was the chance to make the switch. Not least because my collaborator said “if we are going to write this paper in Word, I wouldn’t know where to start”.

[Image: screen-shot-2016-12-11-at-07-39-13]

I signed up for an Overleaf account. For those that don’t know, Overleaf is an online TeX writing tool, with the editor on one half of the screen and a rendered version of your manuscript on the other. The learning curve is quite shallow if you are used to any kind of programming or markup. There are many examples on the site, and finding out how to do stuff is quick thanks to the LaTeX wikibook and StackExchange.

Beyond the TeX, the experience of writing a manuscript in Overleaf is very similar to editing a blog post in WordPress.

Collaboration

The best thing about Overleaf is the ability to collaborate easily. You can send a link to a collaborator and then work on the document together. Using Word in this way can be done via Dropbox, but versioning and track changes often cause more problems than they’re worth, and most people still email Word versions to each other, which is a nightmare. Overleaf changes this by having a simple interface that can be accessed by multiple people. I have never used Google Docs for writing papers, but it does offer similar functionality.

All projects are private by default, but you can put your document up on the site if you want to. You might want to do this if you have developed an example document in a certain style.

[Image: screen-shot-2016-12-11-at-07-38-36]

Versioning

Depending on the type of account you have, you can roll back changes. It is possible to ‘save’ versions, so if you get to a first draft and want to send it round for comment, you can save a version to go back to later if required. This is handy insurance in case somebody comes in to edit the document and breaks something.

You can download a PDF at any point, or for that matter take all the files away as a zip. No more finalfinalpaper3final.docx…

If you’re keeping score, that’s Overleaf 2, Word nil.

Figures

Placing figures in the text is easy and all major formats are supported. What is particularly nice is that I can generate figures in an Igor layout, output them directly to PDF and put that into Overleaf. In Word, the placement of figures can be fiddly. Everyone knows the sensation of moving a picture slightly and watching it disappear inexplicably onto another page. LaTeX will put the figure where you want it or in the next best place. It just works.
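
As a minimal sketch (the file name and caption are invented, and \usepackage{graphicx} is assumed in the preamble), a figure is included with something like:

\begin{figure}[htbp]
    \centering
    % panels.pdf is a hypothetical file exported from an Igor layout
    \includegraphics[width=0.9\textwidth]{panels.pdf}
    \caption{Example figure exported directly to PDF.}
    \label{fig:example}
\end{figure}

The options in square brackets are only placement suggestions (here, top, bottom or a separate page); LaTeX chooses the best fit.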

[Image: screen-shot-2016-12-11-at-07-44-33]

Equations

This is what LaTeX excels at. Microsoft Word has an equation editor which has varied over the years from terrible to just-about-usable. The current version actually uses elements of TeX (I think). The support for mathematical text in LaTeX is amazing, not surprising since this is the way that most papers in maths are written. Any biologist will find their needs met here.

Templates and formatting

There are lots of templates available on Overleaf and many more on the web. For example, there are nice PNAS and PLoS formats as well as others for theses and for CVs and other documents. The typesetting is beautiful. Setting out sections/subsections and table of contents is easy. To be fair to Word, if you know how to use it properly, this is easy too, but the problem is that most people don’t, and also styles can get messed up too easily.

Referencing

This works by adding a BibTeX file to your project. You can do this with any reference manager. Because I have a huge EndNote database, I used this initially. For another manuscript I’ve been working on, my student started out with a Mendeley library and we’ve used that. It’s very flexible, if slightly more fiddly than with Word and EndNote. However, I’ve had so many problems (and crashes) with that combination over the years that any alternative is a relief.
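
As a rough sketch of how the pieces fit together (the bibliography style, the citation key smith2015 and the file name references.bib are all invented for illustration):

% in the preamble
\bibliographystyle{unsrt}

% in the body of the manuscript
...as previously shown \cite{smith2015}.

% at the end of the document, pointing to the uploaded references.bib
\bibliography{references}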

Compiling

You can set the view on the right to compile automatically or you can force updates manually. Either way the document must compile. If you have made a mistake, it will complain and try to guess what you have done wrong and tell you. Errors that prevent the document from being compiled are red. Less serious errors are yellow and allow compilation to go ahead. This can be slow going at first, but I found that I was soon up to speed with editing.

Preamble

This is the name for the material at the head of a TeX document. You can add in all kinds of packages to cover proper usage of units (siunitx) or chemical notation (mhchem). They all have great documentation. All the basics, e.g. referencing, are included in Overleaf by default.
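
A minimal sketch of a preamble using those two packages (the document class and the example sentence are just placeholders):

\documentclass{article}
\usepackage{siunitx} % units, e.g. \SI{10}{\micro\metre}
\usepackage{mhchem}  % chemical notation, e.g. \ce{CaCl2}

\begin{document}
A cell of diameter \SI{10}{\micro\metre} in \ce{CaCl2} solution.
\end{document}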

Offline

The entire concept of Overleaf is to work online. Otherwise you could just use TeXshop or some other program. But how about times when you don’t have internet access? I was concerned about this at the start, but I found that in practice, these days, times when you don’t have a connection are very few and far between. However, I was recently travelling and wanted to work on an Overleaf manuscript on the aeroplane. Of course, with Word, this is straightforward.

With Overleaf it is possible. You can do two things. The first is to download your files ahead of your period of internet outage. You can edit your main.tex document in an editor of your choice. The second option is more sophisticated. You can clone your project with git and then work on that local clone. The instructions of how to do that are here (the instructions, from 2015, say it’s in beta, but it’s fully working). You can work on your document locally and then push changes back to Overleaf when you have access once more.
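
A sketch of the git route (the project ID below is a placeholder; Overleaf shows the real clone URL for each project):

git clone https://git.overleaf.com/your-project-id
cd your-project-id
# edit main.tex offline, then sync when back online
git commit -am "edits made on the plane"
git push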

Downsides

OK. Nothing is perfect, and I noticed that typos and grammatical errors are more difficult for me to detect in Overleaf. I think this is because I am conditioned by years of Word use. The dictionary is smaller than in Word and it doesn’t try to correct your grammar like Word does (although this is probably a good thing!). Maybe I should try the rich text view and see if that helps. I guess the other downside is that the other authors need to know TeX rather than Word. As described above, if you are writing with a mathematician, this is not a problem. For biologists, though, this could be a challenge.

Back to the PhD exam

I actually think that writing a thesis is probably a once-in-a-lifetime chance to understand how Microsoft Word (and EndNote) really works. The candidate explained that she didn’t trust Word enough to do everything right, so her thesis was made of several different documents that were fudged to look like one long thesis. I don’t think this is that unusual. She explained that she had used Word because her supervisor could only use Word and she had wanted to take advantage of the Review tools. Her heart had sunk when her supervisor simply printed out drafts and commented using a red pen, meaning that she could have done it all in LaTeX and it would have been fine.

Conclusion

I have been totally won over by Overleaf. It beats Microsoft Word in so many ways… I’ll stick to Word for grant applications and other non-manuscript documents, but I’m going to keep using Overleaf for manuscripts, with the exception of papers written with people who will only use Word.

Elevation: accuracy of a Garmin Edge 800 GPS device

I use a Garmin Edge 800 GPS device to log my cycling activity, including my commutes. Since I have now built up nearly four years of cycling the same route, I had a good dataset to look at how accurate the device is.

I wrote some code to import all of the rides tagged with commute in rubiTrack 4 Pro (technical details are below). These tracks needed categorising so that they could be compared. Then I plotted them out as a gizmo in Igor Pro and compared them to a reference data set which I obtained via GPS Visualiser.

[Image: commute3d]

The reference dataset is in black, showing the “true” elevation at those particular latitude and longitude coordinates. Plotted on to that are the commute tracks, coloured red-white-blue according to longitude. You can see that there is a range of elevations recorded by the device; apart from a few outliers they are mostly accurate but offset. This is strange because I have the elevation of the start and end points saved in the device, and I thought it corrected the altitude it was measuring to these saved elevations when recording the track. Obviously not.

[Image: abc]

To look at the error in the device I plotted out the difference between the measured altitude at a given location and the true elevation. For each route (to and from work) a histogram of elevation differences is shown to the right. The average difference is 8 m for the commute in and 4 m for the commute back. This is quite a lot considering that all of this is only ~100 m above sea level. The standard deviation is 43 m for the commute in and 26 m for the way back.

[Image: cda]

This post at VeloViewer, comparing GPS data on Strava from pro cyclists riding stage 15 of the 2015 Giro d’Italia, sprang to mind. Some GPS devices performed OK, whereas others (including Garmin) did less well. The idea in that post is that rain affects the recording of some units. This could be true, and although I live in a rainy country, I doubt it can account for the inaccuracies recorded here. Bear in mind that that stage covered some big changes in altitude whereas my recordings cover very little. On the other hand, there are very few tracks in that post whereas there is lots of data here.

[Image: startmid]

It’s interesting that the data are worse going in to work than coming back. I do set off quite early in the morning and it is colder etc. first thing, which might mean the unit doesn’t behave as well for the commute to work. Both the to-work and from-work tracks vary most in their lat/lon recordings at the start of the track, which suggests that the unit is slow to get an exact location (something every Garmin user can attest to), although I always wait until it has a fix before setting off. The final two plots show what the beginning of the return from work looks like for location accuracy (travelling east to west) compared to a midway section of the same commute (right). This might mean that the inaccuracy at the start determines how inaccurate the track is. As I mentioned, the elevation is set for the start and end points. Perhaps if the lat/lon is too far from the endpoint, it fails to collect the correct elevation.

Conclusion

I’m disappointed with the accuracy of the device. However, I have no idea whether other GPS units (including phones) would outperform the Garmin Edge 800, or even whether later Garmin models are better. This is a good but limited dataset. A similar analysis would be possible on a huge dataset (e.g. all Strava data), which would reveal the best and worst GPS devices and/or the best conditions for recording the most accurate data.

Technical details

I described how to get GPX tracks from rubiTrack 4 Pro into Igor, and how to crunch them, in a previous post. I modified the code to get elevation data out of the cycling tracks and generally made the code slightly more robust. This left me with 1,200 tracks. My commutes are varied. I frequently go from A to C via B, and from C to A via D, which makes a loop (this is what is shown here). But I also go from A to C via D, from C to A via B, and I often extend the commute to include 30 km of Warwickshire countryside. The tracks could be categorised by testing whether they began at A or C (this rejected some partial routes) and then testing whether they passed through B or D. These could then be plotted and checked visually for any routes which went off course; there were none. The key here is to pick the right B and D points.

To calculate the differences in elevation, the simplest thing was to get GPS Visualiser to tell me what the elevation should be for all the points I had. I was surprised that the API could do half a million points without complaining. This was sufficient to do the rest. Note that the comparisons needed to be done as lat/lon versus elevation, because differences in speed, time or trackpoint number lead to inherent differences in lat/lon (and elevation) if tracks are compared point-by-point. Note also that, due to the small scale, I didn’t bother converting lat/lon into flat-earth kilometres.
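
As an illustration of the categorisation step only (this is not the code in the repo), a minimal Igor function along these lines can test whether a track passes within some tolerance of a waypoint; the waypoint coordinates and tolerance would be chosen to suit the route:

// Returns 1 if any point of the track comes within tol degrees of (lat0, lon0), 0 otherwise
// latW and lonW are the latitude and longitude waves for one track
Function PassesNear(latW, lonW, lat0, lon0, tol)
    Wave latW, lonW
    Variable lat0, lon0, tol

    Variable i
    for(i = 0; i < numpnts(latW); i += 1)
        if(abs(latW[i] - lat0) < tol && abs(lonW[i] - lon0) < tol)
            return 1
        endif
    endfor
    return 0
End

A track that starts near A (or C) and passes near B (or D) can then be assigned to the corresponding route.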

The post title comes from “Elevation” by Television, which can be found on the classic “Marquee Moon” LP.

Reaching Out

Outreach means trying to engage the public with what we are doing in our research group. For me, this mainly means talking to non-specialists about our work and showing them around the lab. These non-specialists are typically interested members of the public and mainly supporters of the charity that funds work in my lab (Cancer Research UK). The most recent batch of activities have prompted this post on doing outreach.

The challenge

Outreach is challenging. Taking part in these events made me realise what a tough job science communication is, and how good the best communicators are.

There are many ways that an outreach talk is tougher to give than a research seminar. Not least because explaining what we do in the lab can quickly spiral down into a full-on Cell Biology 101 lecture.

A statement like “we work on process x and we are studying a protein called y” needs to be followed by “jobs in cells are done by proteins”, then maybe “proteins are encoded by genes”, in our DNA, which is a bunch of letters, oh there’s mRNA, ahhh stop! Pretty soon, it can get too confusing for the audience. In a seminar, the level of knowledge is already there, so protein y can be mentioned without worrying about why or how it got there.

On the other hand, giving an outreach talk is much easier than giving a seminar because the audience is already warm to you and they don’t want you to stuff it up. It’s a bit like giving a speech at a wedding.

The challenge is exciting because it means that our work needs to be explained plainly and placed in a bigger context. If you get the chance to explain your work to a lay audience, I recommend you try.

Disarming questions

The big difference between doing a scientific talk for scientists and talking to non-specialists is in the questions. They can be disarming, for various reasons. Here are a few that I have had on recent visits. How would you answer?

Can you tell the difference [down the microscope] between cells from a black person versus those from a white person!?

For context, we had just looked at some HeLa cells down the microscope and I had explained a little bit about Henrietta Lacks and the ethical issues surrounding this cell line.

You mentioned evolution but I think you’ll find that the human cell is just too intricate. How do you think cells are really made?

Hint: it doesn’t matter what you reply. You will be unlikely to change their mind.

Do you dream of being famous? What will be your big discovery?

I’ve also been asked “are we close to a cure for cancer?”. It’s important to temper people’s enthusiasm here I think.

Are you anything to do with [The Crick]? No? Good! It’s a waste of money and it shouldn’t have been built in London!

I had wondered whether lay people knew about The Crick, which is now the biggest research institute in the UK. Clearly they do! I tried to explain that The Crick is a chance to merge several institutes that already existed in London, and so it would save money on running these places.

Aren’t you just being exploited by the pharmaceutical industry?

This person was concerned that academics generate knowledge which is then commercialised by companies.

My friend took a herbal remedy and it cured his cancer. Why aren’t you working on that?

Like the question rejecting evolution, it is difficult for people to abandon their N-of-one/anecdotal knowledge.

Does X cause cancer?

This is a problem of the media in our country, I think, who seem to be on a mission to categorise everything (red meat, wine, tin foil) as either cancer-causing or cancer-preventing.

As you can see, the questions are wide-ranging, which is unsettling in itself. It’s very different to “have you tried mutating serine 552 to test if the effect is one of general negative charge on the protein?” that you get in a research seminar.

The charity that organises some of the events I’ve been involved in is really supportive and gives a list of good ways to answer “typical questions”. However, most questions I get are atypical, and the anticipated questions about animal research or embryo cloning do not arise.

I find it difficult to give a succinct answer to these lay questions. I try to give an accurate reply, but this leads to a long and complicated answer that probably confuses the person even more. I have the same problem with children’s questions, which often send me scurrying to Wikipedia to find the exact answer to “why is the sky blue?”. I should learn to just give a vaguely correct answer and not worry about the details so much.

Amazing questions

The best questions are those where you can tell that the person has really got into it. In the last talk I gave, I described “stop” and “go” signals for cell division. One person asked

How does a cell suddenly know that it has to divide? It must get a signal from somewhere… what is that signal?

My initial reply was that asking these sorts of questions is what doing science is all about!

Two more amazing questions:

Is it true that scientists are secretive with their results and think more about advancing their careers than publicising their findings openly to give us value for money?

This was from a supporter of the charity who had read a piece in The Guardian about scientific publishing. She followed up by asking why scientists put their research behind paywalls. I found this tough to answer because I suddenly felt responsible for the behaviour of the entire scientific community.

You mentioned taxol and the side effects. I was taking that for my breast cancer and it is true what you said. It was very painful and I had to stop treatment.

This was the first time a patient had talked to me about their experience of things that were actually in my talk. This was a stark reminder that the research I am doing is not as abstract as I think. It also made me more cautious about the way I talk about current treatments, since people in the room may be actually taking them!

Good support

With the charity I’ve been to polo clubs, hotels, country houses, Bishop’s houses and relay events in public parks. The best part is welcoming people to our lab. These visitors might be a Mayor or people connected with the city football team, but mainly they are interested supporters of the charity. It’s nice to be able to explain where their money goes and what a life in cancer research is really like.

To do these events, there is a team of people doing all the organisation: inviting participants, sorting out parking, tea and coffee etc. The team are super-enthusiastic and they are really skilled at talking to the public. The events could not go ahead without them. So, a big thank you to them. I’ve also been helped by the folks in the lab and colleagues in my building who have helped to show visitors around and let them see cells down the microscope etc.

Give it a try

Of course there are many other ways to engage the public in our research. This is just focussed on talking to non-scientists and the issues that arise. As I’ve tried to outline here, it’s a fun challenge. If you get the opportunity to do this, give it a try.

The post title comes from “Reaching Out” by Matthew Sweet from his Altered Beast LP. Lovely use of diminished seventh in a pop song and of course the drums are by none other than Mick Fleetwood.

Colours Running Out: Analysis of 2016 running

Towards the end of 2015, I started distance running. I thought it’d be fun to look at the frequency of my runs over the course of 2016.

Most of my runs were recorded with a GPS watch. I log my cycling data using rubiTrack, so I just added my running data to this. This software is great, but to do any serious number crunching, other software is needed. Yes, I know that if I used Strava I could do lots of things with my data… but I don’t. I also know that there are tools for R to do this, but I wrote something in Igor instead. The GitHub repo is here. There’s a technical description below, as well as some random thoughts on running (and cycling).

The animation shows the tracks I recorded as 2016 rolled by. The routes won’t mean much to you, but I can recognise most of them. You can see how I built up the distance to run a marathon and then how the runs became less frequent through late summer to October. I logged 975 km with probably another 50 km or so not logged.

[Image: run2016]

Technical description

Pulling the data out of rubiTrack 4 Pro is actually quite difficult since there is no automated export. An AppleScript did the job of going through all the run activities and exporting them as GPX. There is an API provided by Garmin to take the data straight from the FIT files recorded by the watch, but everything is saved and tagged in rubiTrack, so GPX is a good starting point. GPX is an XML format which can be read into Igor using the XMLutils XOP written by andyfaff. Previously, I’ve used nokogiri for reading XML, but this XOP keeps everything within Igor. This worked OK, but I had some trouble with namespaces which I didn’t resolve properly, so what is in the code is a slight hack.

I wrote some code which imported all the files and then processed the time frame I wanted to look at. It basically looks at a.m. and p.m. for each day in the time frame. Igor deals with date/time nicely, so this was quite easy. Two lookups per day were needed because I often went for two runs per day (run commuting). I set the lat/lon at the start of each track to 0,0. I used the new alpha tools in IP7 to fade the tracks so that they decay away over time: they disappear with a 1/8 reduction in opacity over a four-day period. Igor writes out to .mov, which worked really nicely, but WordPress can’t host movies, so I added a line to write out TIFFs of each frame of the animation and assembled a GIF using FIJI.

Getting started with running

Getting into running was almost accidental. I am a committed cyclist and had always been of the opinion that, since running doesn’t improve aerobic cycling performance (only cycling does that), any activity other than cycling is a waste of time. However, I realised that finding time for cycling was getting more difficult, and also that my goal is to keep fit rather than actually be a pro cyclist, so running had to be worth a try. Roughly speaking, running is about three times more time-efficient than cycling: one hour of running approximates to three hours of cycling. I thought I would just try it. Over the winter. No more than that. Of course, I soon got the running bug and ran through most of 2016, taking part in a few running events (a marathon, half marathons and a 10K). Four quick notes on my experience:

  1. The key thing to keeping running is staying healthy and uninjured. That means building up distance and frequency of running very slowly. In fact, the limitation to running is the body’s ability to actually do the distance. In cycling this is different: as long as you fuel adequately and you’re reasonably fit, you could cycle all day if you wanted. This is not true of running, and so building up to longer distances is essential and the ramp-up shouldn’t be rushed. Injuries will cost you lost weeks on a training schedule.
  2. There are lots of things “people don’t tell you” about running. Blisters everyone knows about, but losing a toenail during a 20 km run? Encountering runner’s GI problems? There are lots of surprises as you start out. Joining a club or reading running forums probably helps (I didn’t bother!). In case you are wondering, the respective answers are: get decent shoes fitted and, well, there is no cure.
  3. Going from cycling to running meant going from very little upper body mass to gaining extra muscle. This means gaining weight. This is something of a shock to a cyclist and seems counterintuitive, since more activity should really equate to weight loss. I maintained cycling through the year, but was not expecting a gain of ~3 kilos.
  4. As with any sport, having something to aim for is essential. Training for training’s sake can become pointless, so line up something to shoot for. Sign up for an event or at least have an achievement (distance, average speed) in your mind that you want to achieve.

So there you have it. I’ll probably continue to mix running with cycling in 2017. I’ll probably extend the repo to do more with cycling data if I have the time.

The post title is taken from “Colours Running Out” by TOY from their eponymous LP.

Adventures in Code III: the quantixed ImageJ Update site

We have some macros for ImageJ/FIJI for making figures and blind analysis which could be useful to others.

I made an ImageJ Update Site so that the latest versions can be pushed out to the people in the lab, but this also gives the opportunity to share our code with the world. Feel free to add the quantixed ImageJ update site to your ImageJ or FIJI installation. Details of how to do that are here.

The code is maintained in this GitHub repo, which has a walkthrough for figure-making in the README. So, if you’d like to make figures the quantixed way, adding ROIs and zooms, then feel free to give this code a try. Please raise any issues there or get in touch some other way.

Disclaimer: this code is under development. I offer no guarantees to its usefulness. I am not responsible for data loss or injury that may result from its use!

Update @ 10:35 2016-12-20 I should point out that other projects already exist to make figures (MagicMontage, FigureJ, ScientiFig). These projects are fine but they didn’t do what I wanted, so I made my own.

Come To California

I’ve returned from the American Society for Cell Biology 2016 meeting in San Francisco. Despite being a cell biologist, and despite people from my lab having attended this meeting numerous times, this was my first ASCB meeting.

[Image: cell-biology-2016]

The conference was amazing: so much excellent science and so many opportunities to meet up with people. For the areas that I work in (mitosis, cytoskeleton and membrane traffic), the meeting was pretty much made for me. Often two or more sessions I wanted to attend ran at the same time. I’ll try to summarise some of my highlights.

One of the best talks I saw was from Dick McIntosh, who is a legend of cell biology and is still making outstanding contributions. He showed some new tomography data of growing microtubules in a number of systems which suggest that microtubules have curved protofilaments as they grow. This is in agreement with structural data and some models of MT growth, but not with many other schematic diagrams.

The “bottom-up cell biology” subgroup was one of the first I attended. Organised by Dan Fletcher and Matt Good, the theme was reconstitution of biological systems in vitro. The mix of speakers was great, with Thomas Surrey and Marileen Dogterom giving great talks on microtubule systems, and Jim Hurley and Patricia Bassereau representing membrane curvature reconstitution. Physical principles and quantitative approaches were a strong theme here and throughout the meeting, which reflects where cell biology is at right now.

[Image: img_3382]

I took part in a subgroup on preprints organised by Prachee Avasthi and Jessica Polka. I will try to write a separate post about this soon. This was a fun session that was also a chance to meet up with many people I had only met virtually. There was a lot of excitement about preprints at the meeting and it seemed like many attendees were aware of preprinting. I guess this is not too surprising since the ASCB have been involved with the Accelerating Science and Publication in Biology (ASAPbio) group since the start.

Of the super huge talks I saw in the big room, the Cellular Communities session really stood out. Bonnie Bassler and Jürgen Knoblich gave fantastic talks on bacterial quorum sensing and “minibrains”, respectively. The Porter Lecture, given by Eva Nogales on microtubule structure, was another highlight.

The poster sessions (which I heard described as sprawling and indigestible) were actually my favourite part of the meeting. I saw mostly new work here and had the chance to talk to quite a few presenters. My lab took three posters on different projects at various stages of publication (Laura’s work, preprinted and in revision, presented by me; Nick’s work, soon to be submitted; and Gabrielle’s work, soon to be written up), so we were all happy to get some useful feedback on our work. We’ve had follow-up emails and requests for collaboration, which made the long trip worthwhile. We also had a mini lab reunion with Dan Booth, one of my former students, who was presenting his work on using 3D Correlative Light Electron Microscopy to examine chromosome structure.

For those that follow me on Twitter, you may know that I like to make playlists from my iTunes library when I visit another city. This was my first time back on the west coast since 2001. Here are ten tracks selected from my San Francisco, CA playlist:

10. California Über Alles – Dead Kennedys from Fresh Fruit For Rotting Vegetables

9. San Franciscan Nights – The Animals from Winds of Change

8. Who Needs the Peace Corps? – The Mothers of Invention from We’re Only In It For The Money

7. San Francisco – Brian Wilson and Van Dyke Parks from Orange Crate Art

6. Going to California – Led Zeppelin from IV

5. Fake Tales of San Francisco – Arctic Monkeys from Whatever People Say I Am, That’s What I’m Not

4. California Hills – Ty Segall from Emotional Mugger

3. The Portland Cement Factory at Monolith California – Cul de Sac from ECIM (OK Monolith is nearer to LA than SF but it’s a great instrumental track).

2. Come to California – Matthew Sweet from Blue Sky on Mars

1. Russian Hill – Jellyfish from Spilt Milk

Before the meeting, I went on a long walk around SF with the guys from the lab and we accidentally found ourselves on Russian Hill.

[Image: img_3334]

For some reason I have a higher than average number of bootlegs recorded in SF. Television (Old Waldorf 1978), Elliott Smith (Bottom of the Hill, 1998), Jellyfish (Warfield Theater 1993), My Bloody Valentine, Jimi Hendrix etc. etc.

The post title comes from #2 in my playlist.

Tips from the blog X: multi-line commenting in Igor

This is part-tip, part-adventures in code. I found out recently that it is possible to comment out multiple lines of code in Igor and thought I’d put this tip up here.

Multi-line commenting in programming is useful for two reasons:

  1. writing comments (instructions, guidance) that last more than one line
  2. the ability to temporarily remove a block of code while testing

In each computer language there is the ability to comment out at least one line of code.

In Igor this is “//”, which comments out the whole line, but no more.

[Image: ipcomment1]

This is the same as in ImageJ macro language.

[Image: ijcomment1]

Now, commenting out whole sections in FIJI/ImageJ is easy: insert “/*” where you want the comment to start, and then “*/” where it ends, multiple lines later.

[Image: ijcomment2]

I didn’t think this syntax was available in Igor, and it isn’t really. I was manually adding “//” for each line I wanted to remove, which was annoying. It turns out that you can use Edit > Commentize to add “//” to the start of all selected lines. The keyboard shortcut in IP7 is Cmd-/. You can reverse the process with Edit > Decommentize or Cmd-\.

[Image: ipcomment2]

There is actually another way. Igor can conditionally compile code. This is useful if, for example, you write code for both Igor 7 and Igor 6: you can have IP7-only commands compile only if the user is running IP7. This same logic can be used to comment out code as follows.

[Image: ipcomment3]

The condition “#if 0” is never satisfied, so the enclosed code does not compile. The equivalent statement for IP7-specific compilation is “#if igorversion()>=7”.
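
In code, the two variants look something like this (a minimal sketch):

#if 0
    // nothing between #if 0 and #endif is compiled,
    // so this works as a block comment
#endif

#if igorversion() >= 7
    // IP7-only code would go here and compiles only in Igor Pro 7 or later
#endif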

So there you have it, two ways to comment out code in Igor. These tips were from IgorExchange.

If you want to read more about commenting in different languages and the origins of comments, read here.

This post is part of a series of tips.

Bateman Writes: Eye of the Tiger

I don’t often write about music at quantixed but I recently caught Survivor’s “Eye of The Tiger” on the radio and thought it deserved a quick post.

Surely everyone knows this song: a kind of catchall motivational tune. It is loved by people in gyms with beach-unready bodies and by presidential hopefuls without permission to use it.

Written specifically for Rocky III after Sylvester Stallone was refused permission by Queen to use “Another One Bites The Dust”, it has that 1980s middle-of-the-road hard-rock-but-not-heavy-metal feel to it. The kind of track that must be filed under “guilty pleasure”. Possibly you love this song. Maybe you get ready to meet your opponents whilst listening to it? If this is you, please don’t read on.

I find it difficult to listen to this track because of the timing of the intro. Not sure what I mean?

Here is a waveform of one channel for the intro. Two of the opening phrases are shown underlined. A phrase in this case is: dun, dun-dun-dun, dun-dun-dun, dun-dun-durrrr. Can you see the problem with the second of those two phrases?

[Image: tiger1]

Still don’t see it? In the second phrase the second of the dun-dun-duns comes in late.

I’ve overlaid the waveform again to compare phrase 1 with phrase 2.

[Image: tiger2]

The difference is one eighth note (a quaver) and it drives me nuts. I think it’s intentional because, well, the whole band play the same thing. I don’t think it’s a tape splice error, because the track sounds live and surely someone must have noticed. Finally, they play these phrases again in the outro and at that point the timing is correct. No, it’s intentional. Why?

From this page Jim Peterik of Survivor says:

I started doing that now-famous dead string guitar riff and started slashing those chords to the punches we saw on the screen, and the whole song took shape in the next three days.

So my best guess is that the notes were written to match the on-screen action!

The video on YouTube is only at 220 million views (at the time of writing). Give it a listen, if my description of dun-dun-duns was not illustrative enough for you.

Notes:

  • The waveform is taken from the Eye of The Tiger album version of the song. I read that the version in the movie is actually the demo version.
  • I loaded it into Igor using SoundLoadWave. I made an average of the stereo channels using MatrixOp and then downsampled the wave from 44.1 kHz so it was easier to move around.

A very occasional series on music. The name Bateman Writes refers to the obsessive writings of the character Patrick Bateman in Bret Easton Ellis’s novel American Psycho. This serial killer had a penchant for the middle-of-the-road rock act Huey Lewis & The News.

Blind To The Truth

Molecular Biology of the Cell, the official journal of the American Society for Cell Biology, recently joined a number of other periodicals in issuing guidelines for manuscripts concerning statistics and reproducibility. I discussed these guidelines with the lab and we felt that there are two areas where we can improve:

  • blind analysis
  • power calculations

A post about power analysis is brewing; this post is about a solution for blind analysis.

For anyone that doesn’t know, blind analysis means that the person doing the analysis is blind to (does not know) the experimental conditions. It is a way of reducing bias, intentional or otherwise, in the analysis of experimental data. Most of our analysis workflows are blinded, because a computer does the analysis in an automated way, so there is no way for a human to bias the result. Typically, a bunch of movies are captured, fed into a program using identical settings, and the answer gets spat out. Nothing gets excluded, or the exclusion criteria are agreed beforehand. Whether the human operating the computer is blind or not to the experimental conditions doesn’t matter.

For analysis that has a manual component we do not normally blind the analyser. Instead we look for ways to make the analysis free of bias. An example is using a non-experimental channel in the microscope image to locate a cellular structure. This means the analysis is done without “seeing” any kind of “result”, which might bias the analysis.

Sometimes, we do analysis which is almost completely manual and this is where we can improve by using blinding. Two objections raised to blinding are practical ones:

  • it is difficult/slow to get someone else to do the analysis of your data (we’ve tried it and it doesn’t work well!)
  • the analyser “knows” the result anyway, in the case of conditions where there is a strong effect

There’s not much we can do about the second one. But the solution to the first is to enable people to blindly analyse their own data if it is needed.

I wrote* a macro in ImageJ called BlindAnalysis.ijm which renames the files in a random fashion** and makes a TSV log of the associations. The analyser can simply analyse blind_0001.tif, blind_0002.tif and so on, and then reassociate the results to the real files using this TSV.

[Image: blindanalysis]

The picture shows the macro in action. A folder containing 10 TIFFs is processed into a new folder called BLIND. The files are stripped of labels (compare the original TIFF, left, with the blind version, right) and saved with blinded names. The log file keeps track of the associations.
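
The real macro is in the GitHub repo linked above. As a rough sketch of the core idea only (written for this post, simplified, and without the label-stripping that BlindAnalysis.ijm does), the renaming and logging might look something like this in the ImageJ macro language:

// Minimal sketch: copy TIFFs to a BLIND folder under shuffled names and log the mapping
srcDir = getDirectory("Choose a source folder");
blindDir = srcDir + "BLIND" + File.separator;
File.makeDirectory(blindDir);
list = getFileList(srcDir);
n = lengthOf(list);

// Fisher-Yates shuffle of the file order
for (i = n - 1; i > 0; i--) {
    j = floor(random() * (i + 1));
    tmp = list[i]; list[i] = list[j]; list[j] = tmp;
}

// save each TIFF under a blinded name and record the association
logPath = blindDir + "blind_log.tsv";
File.append("original\tblinded", logPath);
counter = 1;
for (i = 0; i < n; i++) {
    if (endsWith(list[i], ".tif")) {
        open(srcDir + list[i]);
        blindName = "blind_" + IJ.pad(counter, 4) + ".tif";
        saveAs("Tiff", blindDir + blindName);
        close();
        File.append(list[i] + "\t" + blindName, logPath);
        counter++;
    }
}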

I hope this simple macro is useful to others. Feedback welcome either on this post or on GitHub.

* actually, I found an old macro on the web called Shuffler written by Christophe Leterrier. This needed a bit of editing and also had several options that weren’t needed, so it was more deleting than writing.

** it also strips out the label of the file. If you only rename files, the label remains in ImageJ so the analyser is not blind to the condition. Originally I was working on a bash script to do the renaming, but when I realised that I needed to strip out the labels, I needed to find an all-ImageJ solution.

Edit @ 2016-10-11T06:05:48.422Z I have updated the macro with the help of some useful suggestions.

The post title is taken from “Blind To The Truth”, a 22-second-long track from Napalm Death’s 2nd LP ‘From Enslavement To Obliteration’.