Molecular Biology of The Cell, the official journal of the American Society for Cell Biology, recently joined a number of other periodicals in issuing guidelines for manuscripts, concerning statistics and reproducibility. I discussed these guidelines with the lab and we felt that there are two areas where we can improve:
- blind analysis
- power calculations
A post about power analysis is brewing, this post is about a solution for blind analysis.
For anyone that doesn’t know, blind analysis refers to: the person doing the analysis being blind to (not knowing) the experimental conditions. It is a way of reducing bias, intentional or otherwise, of analysis of experimental data. Most of our analysis workflows are blinded, because a computer does the analysis in an automated way so there is no way of a human biasing the result. Typically, a bunch of movies are captured, fed into a program using identical settings, and the answer gets spat out. Nothing gets excluded, or the exclusion criteria are agreed beforehand. Whether the human operating the computer is blind or not to the experimental conditions doesn’t matter.
For analysis that has a manual component we do not normally blind the analyser. Instead we look for ways to make the analysis free of bias. An example is using a non-experimental channel in the microscope image to locate a cellular structure. This means the analysis is done without “seeing” any kind of “result”, which might bias the analysis.
Sometimes, we do analysis which is almost completely manual and this is where we can improve by using blinding. Two objections raised to blinding are practical ones:
- it is difficult/slow to get someone else to do the analysis of your data (we’ve tried it and it doesn’t work well!)
- the analyser “knows” the result anyway, in the case of conditions where there is a strong effect
There’s not much we can do about the second one. But the solution to the first is to enable people to blindly analyse their own data if it is needed.
I wrote* a macro in ImageJ called BlindAnalysis.ijm which renames the files in a random fashion** and makes tsv log of the associations. The analyser can simply analyse blind_0001.tif, blind_0002.tif and then reassociate the results to the real files using this tsv.
The picture shows the macro in action. A folder containing 10 TIFFs is processed into a new folder called BLIND. The files are stripped of labels (look at the original TIFF, left and the blind version, right) and saved with blinded names. The log file keeps track of associations.
I hope this simple macro is useful to others. Feedback welcome either on this post or on GitHub.
* actually, I found an old macro on the web called Shuffler written by Christophe Leterrier. This needed a bit of editing and also had several options that weren’t needed, so it was more deleting than writing.
** it also strips out the label of the file. If you only rename files, the label remains in ImageJ so the analyser is not blind to the condition. Originally I was working on a bash script to do the renaming, but when I realised that I needed to strip out the labels, I needed to find an all-ImageJ solution.
Edit @ 2016-10-11T06:05:48.422Z I have updated the macro with the help of some useful suggestions.
The post title is taken from “Blind To The Truth” a 22 second-long track from Napalm Death’s 2nd LP ‘From Enslavement To Obliteration.