bioRxiv, the preprint server for biology, recently turned 2 years old. This seems a good point to take a look at how bioRxiv has developed over this time and to discuss any concerns sceptical people may have about using the service.
Firstly, thanks to Richard Sever (@cshperspectives) for posting the data below. The first plot shows the number of new preprints deposited and the number that were revised, per month since bioRxiv opened in Nov 2013. There are now about 200 preprints being deposited per month and this number will continue to increase. The cumulative article count (of new preprints) shows that, as of the end of last month, there are >2500 preprints deposited at bioRxiv.
What is take up like across biology? To look at this, the number of articles in different subject categories can be totted up. Evolutionary Biology, Bioinformatics and Genomics/Genetics are the front-running disciplines. Obviously counting articles should be corrected for the size of these fields, but it’s clear that some large disciplines have not adopted preprinting in the same way. Cell biology, my own field, has some catching up to do. It’s likely that this reflects cultures within different fields. For example, genomics has a rich history of data deposition, sharing and openness. Other fields, less so…
So what are we waiting for?
I’d recommend that people wondering about preprinting go and read Stephen Curry’s post “just do it”. Any people who remain sceptical should keep reading…
Do I really want to deposit my best work on bioRxiv?
I’ve picked six preprints that were deposited in 2015. This selection demonstrates how important work is appearing first at bioRxiv and is being downloaded thousands of times before the papers appear in the pages of scientific journals.
- Accelerating scientific publishing in biology. A preprint about preprinting from Ron Vale, subsequently published in PNAS.
- Analysis of protein-coding genetic variation in 60,706 humans. A preprint summarising a huge effort from the Exome Aggregation Consortium (ExAC). 12,366 views, 4,534 downloads.
- TP53 copy number expansion correlates with the evolution of increased body size and an enhanced DNA damage response in elephants. This preprint was all over the news, e.g. Science.
- Sampling the conformational space of the catalytic subunit of human γ-secretase. CryoEM is the hottest technique in biology right now. Sjors Scheres’ group have been at the forefront of this revolution. This paper is now out in eLife.
- The genome of the tardigrade Hypsibius dujardini. The recent controversy over horizontal gene transfer in Tardigrades was rapidfire thanks to preprinting.
- CRISPR with independent transgenes is a safe and robust alternative to autonomous gene drives in basic research. This preprint concerning biosafety of CRISPR/Cas technology could be accessed immediately thanks to preprinting.
But many journals consider preprints to be previous publications!
Wrong. It is true that some journals have yet to change their policy, but the majority – including Nature, Cell and Science – are happy to consider manuscripts that have been preprinted. There are many examples of biology preprints that went on to be published in Nature (ancient genomes) and Science (hotspots in birds). If you are worried about whether the journal you want to submit your work to will allow preprinting, check this page first or the SHERPA/RoMEO resource. The journal “information to authors” page should have a statement about this, but you can always ask the Editor.
I’m going to get scooped
Preprints establish priority. It isn’t possible to be scooped if you deposit a preprint that is time-stamped showing that you were the first. The alternative is to send it to a journal where no record will exist that you submitted it if the paper is rejected, or sometimes even if they end up publishing it (see discussion here). Personally, I feel that the fear of scooping in science is overblown. Even in fields that are so hot that papers are coming out really fast and the fear of scooping is high, everyone sees the work if it’s on bioRxiv or elsewhere – who was first is clear to all. Think of it this way: depositing a preprint at bioRxiv is just the same as giving a talk at a meeting. Preprints mean that there is a verifiable record available to everyone.
Preprints look ugly, I don’t want people to see my paper like that.
The depositor can format their preprint however they like! Check out Christophe Leterrier’s beautifully formatted preprint, or this one from Dennis Eckmeier. Both authors made their templates available so you can follow their example (1 and 2).
Yes but does -insert name of famous scientist- deposit preprints?
Lots of high profile scientists have already used bioRxiv. David Bartel, Ewan Birney, George Church, Ray Deshaies, Jennifer Doudna, Steve Henikoff, Rudy Jaenisch, Sophien Kamoun, Eric Karsenti, Maria Leptin, Rong Li, Andrew Murray, Pam Silver, Bruce Stillman, Leslie Vosshall and many more. Some sceptical people may find this argument compelling.
I know how publishing works now and I don’t want to disrupt the status quo
It’s paradoxical how science is all about pushing the frontiers, yet when it comes to publishing, scientists are incredibly conservative. Physics and Mathematics have been using preprinting as part of the standard route to publication for decades, so adoption by biology is nothing unusual; we will simply be catching up. One vision for the future of scientific publishing is that we will deposit preprints and then journals will search out the best work from the server to highlight in their pages. The journals that will do this are called “overlay journals”. Sounds crazy? It’s already happening in Mathematics. Terry Tao, a Fields Medal-winning mathematician, recently deposited a solution to the Erdős discrepancy problem on arXiv (he actually put it on his blog first). This was then “published” in Discrete Analysis, an overlay journal. Read about this here.
Disclaimer: other preprint services are available, including F1000 Research, PeerJ Preprints and of course arXiv itself, which has a quantitative biology section. My lab have deposited work at bioRxiv (1, 2 and 3) and I am an affiliate for the service, which means I check preprints before they go online.
Edit 14/12/15 07:13 put the scientists in alphabetical order. Added a part about scooping.
The post title comes from the term “white label” which is used for promotional vinyl copies of records ahead of their official release.
We have a new paper out! You can access it here.
The work was mainly done by Cristina Gutiérrez Caballero, a post-doc in the lab. We had some help from Selena Burgess and Richard Bayliss at the University of Leicester, with whom we have an ongoing collaboration.
The paper in a nutshell
We found that TACC3 binds the plus-ends of microtubules via an interaction with ch-TOG. So TACC3 is a +TIP.
What is a +TIP?
This is a term used to describe proteins that bind to the plus-ends of microtubules. Microtubules are a major component of the cell’s cytoskeleton. They are polymers of alpha/beta-tubulin that grow and shrink, a feature known as dynamic instability. A microtubule has polarity: the fast growing end is known as the plus-end, and the slower growing end is referred to as the minus-end. There are many proteins that bind to the plus-end and these are termed +TIPs.
OK, so what are TACC3 and ch-TOG?
They are two proteins found on the mitotic spindle. TACC3 is an acronym for transforming acidic coiled-coil protein 3, and ch-TOG stands for colonic and hepatic tumour overexpressed gene. As you can tell from the names, they were discovered due to their altered expression in certain human cancers. TACC3 is a well-known substrate for Aurora A kinase, which is an enzyme that is often amplified in cancer. The ch-TOG protein is thought to be a microtubule polymerase, i.e. an enzyme that helps microtubules grow. In the paper, we describe how TACC3 and ch-TOG stick together at the microtubule end. TACC3 and ch-TOG are at the very end of the microtubule, where they move ahead of other +TIPs like “end-binding proteins”, e.g. EB3.
What is the function of TACC3 as a +TIP?
We think that TACC3 is piggybacking on ch-TOG while it is acting as a polymerase, but any biological function or consequence of this piggybacking was difficult to detect. We couldn’t see any clear effect on microtubule dynamics when we removed or overexpressed TACC3. We did find that loss of TACC3 affects how cells migrate, but this is not likely to be due to a change in microtubule dynamics.
I thought TACC3 and ch-TOG were centrosomal proteins…
In the paper we look again at this and find that there are different pools of TACC3, ch-TOG and clathrin (alone and in combination) and describe how they reside in different places in the cell. Although ch-TOG is clearly at centrosomes, we don’t find TACC3 at centrosomes, although it is on microtubules that cluster near the centrosomes at the spindle pole. TACC3 is often described as a centrosomal protein in lots of other papers, but this is quite misleading.
We were on the cover – whatever that means in the digital age! We imaged a cell expressing tagged EB3 proteins (EB3 is another +TIP). We coloured consecutive frames different colours and the result looked pretty striking. Biology Open picked it as their cover, which we were really pleased about. Our paper is AOP at the moment and so hopefully they won’t change their mind by the time it appears in the next issue.
This is the second paper that we have deposited as a preprint at bioRxiv (not counting a third paper that we preprinted after it was accepted). I was keen to preprint this particular paper because we became aware that two other groups had similar results following a meeting last summer. Strangely, a week or so after preprinting and submitting to a journal, a paper from a completely different group appeared with a very similar finding! We’d been “scooped”. They had found that the Xenopus homologue of TACC3 was a +TIP in retinal neuronal cultures. The other group had clearly beaten us to it, having submitted their paper some time before our preprint. The reviewers of our paper complained that our data was no longer novel and our paper was rejected. This was annoying because there were lots of novel findings in our paper that weren’t in theirs (and vice versa). The reviewers did make some other constructive suggestions that we incorporated into the manuscript. We updated our preprint and then submitted to Biology Open. One advantage of the preprinting process is that the changes we made can be seen by all. Biology Open were great and took a decision based on our comments from the other journal and the changes we had made in response to them. Their decision to provisionally accept the paper was made in four days. Like our last experience publishing in Biology Open, it was very positive.
Gutiérrez-Caballero, C., Burgess, S.G., Bayliss, R. & Royle, S.J. (2015) TACC3-ch-TOG track the growing tips of microtubules independently of clathrin and Aurora-A phosphorylation. Biol. Open doi:10.1242/bio.201410843.
Nwagbara, B. U., Faris, A. E., Bearce, E. A., Erdogan, B., Ebbert, P. T., Evans, M. F., Rutherford, E. L., Enzenbacher, T. B. and Lowery, L. A. (2014) TACC3 is a microtubule plus end-tracking protein that promotes axon elongation and also regulates microtubule plus end dynamics in multiple embryonic cell types. Mol. Biol. Cell 25, 3350-3362.
The post title is taken from the last track on The Orb’s U.F.Orb album.
I noticed something strange about the 2013 Impact Factor data for eLife.
Before I get onto the problem, I feel I need to point out that I dislike Impact Factors and think that their influence on science is corrosive. I am a DORA signatory and I try to uphold those principles. I admit that, in the past, I used to check the new Impact Factors when they were released, but no longer. This year, when the 2013 Impact Factors came out I didn’t bother to log on to take a look. A chance Twitter conversation with Manuel Théry (@ManuelTHERY) and Christophe Leterrier (@christlet) was my first encounter with the new numbers.
Huh? eLife has an Impact Factor?
For those that don’t know, the 2013 Impact Factor is worked out by counting the total number of 2013 cites to articles in a given journal that were published in 2011 and 2012. This number is divided by the number of “citable items” in that journal in 2011 and 2012.
Now, eLife launched in October 2012. So it seems unfair that it gets an Impact Factor since it only published papers for 12.5% of the window under scrutiny. Is this normal?
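To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above. The function names are my own, the coverage calculation assumes whole calendar months counted from the launch month, and the cite and item counts in the usage lines are purely hypothetical placeholders, not real figures for any journal.

```python
from datetime import date


def impact_factor(cites_in_year: int, citable_items: int) -> float:
    """Cites in year Y to items from Y-1 and Y-2, divided by the
    number of citable items published in Y-1 and Y-2."""
    if citable_items <= 0:
        raise ValueError("no citable items in the window")
    return cites_in_year / citable_items


def window_coverage(launch: date, if_year: int) -> float:
    """Fraction of the 24-month window (the two calendar years before
    if_year) during which the journal was publishing.
    Assumption: whole months, counted from the launch month."""
    window_start = date(if_year - 2, 1, 1)
    window_end = date(if_year, 1, 1)  # exclusive upper bound
    start = max(launch, window_start)
    if start >= window_end:
        return 0.0
    months = (window_end.year - start.year) * 12 + (window_end.month - start.month)
    return months / 24


# A journal launched in October 2012 covers 3 of the 24 months
# in the window for the 2013 Impact Factor.
print(window_coverage(date(2012, 10, 1), 2013))  # 0.125

# Hypothetical numbers only: 200 cites to 100 citable items.
print(impact_factor(200, 100))  # 2.0
```

By the same calculation, a journal that launched in January 2012 was publishing for half of the window.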
I looked up the 2013 Impact Factor for Biology Open, a Company of Biologists journal that launched in January 2012* and… it doesn’t have one! So why does eLife get an Impact Factor but Biology Open doesn’t?**
Looking at the numbers for eLife revealed that there were 230 citations in 2013 to eLife papers in 2011 and 2012, one of which was a mis-citation to an article in 2011. This article does not exist (the next column shows that there were no articles in 2011). My guess is that Thomson Reuters view this as the journal existing for 2011 and 2012, and therefore deserving of an Impact Factor. Presumably there are no mis-cites in the Biology Open record and it will only get an Impact Factor next year. Doesn’t this call into question the veracity of the database? I have found other errors in records previously (see here). I also find it difficult to believe that no-one checked this particular record given the profile of eLife.
Perhaps unsurprisingly, I couldn’t track down the rogue citation. I did look at the cites to eLife articles from all years in Web of Science, the Thomson Reuters database (which again showed that eLife only started publishing in Oct 2012). As described before there are spurious citations in the database. Josh Kaplan’s eLife paper on UNC13/Tomosyn managed to rack up 5 citations in 2004, some 9 years before it was published (in 2013)! This was along with nine other papers that somehow managed to be cited in 2004 before they were published. It’s concerning enough that these data are used for hiring, firing and funding decisions, but if the data are incomplete or incorrect this is even worse.
Summary: I’m sure the Impact Factor of eLife will rise as soon as it has a full window for measurement. This would actually be 2016 when the 2015 Impact Factors are released. The journal has made it clear in past editorials (and here) that it is not interested in an Impact Factor and won’t promote one if it is awarded. So, this issue makes no difference to the journal. I guess the moral of the story is: don’t take the Impact Factor at face value. But then we all knew that already. Didn’t we?
* For clarity, I should declare that we have published papers in eLife and Biology Open this year.
** The only other reason I can think of is that eLife was listed on PubMed right away, while Biology Open had to wait. This caused some controversy at the time. I can’t see why a PubMed listing should affect Impact Factor. Anyhow, I noticed that Biology Open got listed in PubMed by October 2012, so in the end it is comparable to eLife.
Edit: There is an update to this post here.
Edit 2: This post is the most popular on Quantixed. A screenshot of visitors’ search engine queries (Nov 2014)…
The post title is taken from “Strange Things” from Big Black’s Atomizer LP released in 1986.
We have a new paper out! You can read it here.
I thought I would write a post on how this paper came to be and also about our first proper experience with preprinting.
Title of the paper: Non-specificity of Pitstop 2 in clathrin-mediated endocytosis.
In a nutshell: we show that Pitstop 2, a supposedly selective clathrin inhibitor acts in a non-specific way to inhibit endocytosis.
Background: The description of “pitstops” – small molecules that inhibit clathrin-mediated endocytosis – back in 2011 in Cell was heralded as a major step-forward in cell biology. And it really would be a breakthrough if we had ways to selectively switch off clathrin-mediated endocytosis. Lots of nasty things gain entry into cells by hijacking this pathway, including viruses such as HIV and so if we could stop viral entry this could prevent cellular infection. Plus, these reagents would be really handy in the lab for cell biologists.
The rationale for designing the pitstop inhibitors was that they should block the interaction between clathrin and adaptor proteins. Adaptors are the proteins that recognise the membrane and cargo to be internalised – clathrin itself cannot do this. So if we can stop clathrin from binding adaptors there should be no internalisation – job done! Now, in 2000 or so, we thought that clathrin binds to adaptors via a single site on its N-terminal domain. This information was used in the drug screen that identified pitstops. The problem is that, since 2000, we have found that there are four sites on the N-terminal domain of clathrin that can each mediate endocytosis. So blocking one of these sites with a drug should do nothing. Despite this, pitstop compounds, which were shown to have a selectivity for one site on the N-terminal domain of clathrin, blocked endocytosis. People in the field scratched their heads at how this was possible.
A damning paper was published in 2012 from Julie Donaldson’s lab showing that pitstops inhibit clathrin-independent endocytosis as well as clathrin-mediated endocytosis. Apparently, the compounds affect the plasma membrane and so all internalisation is inhibited. Many people thought this was the last that we would hear about these compounds. After all, these drugs need to be highly selective to be any use in the lab let alone in the clinic.
Our work: we had our own negative results using these compounds, sitting on our server, unpublished. Back in February 2011, while the Pitstop paper was under revision, the authors of that study sent some of these compounds to us in the hope that we could use these compounds to study clathrin on the mitotic spindle. The drugs did not affect clathrin binding to the spindle (although they probably should have done) and this prompted us to check whether the compounds were working – they had been shipped all the way from Australia so maybe something had gone wrong. We tested for inhibition of clathrin-mediated endocytosis and they worked really well.
At the time we were testing the function of each of the four interaction sites on clathrin in endocytosis, so we added Pitstop 2 to our experiments to test for specificity. We found that Pitstop 2 inhibits clathrin-mediated endocytosis even when the site where Pitstops are supposed to bind has been mutated! The picture shows that the compound (pink) binds where sequences from adaptors can bind. Mutation of this site doesn’t affect endocytosis, because clathrin can use any of the other three sites. Yet Pitstop blocks endocytosis mediated by this mutant, so it must act elsewhere, non-specifically.
So the compounds were not as specific as claimed, but what could we do with this information? There didn’t seem enough to publish and I didn’t want people in the lab working on this as it would take time and energy away from other projects. Especially when debunking other people’s work is such a thankless task (why this is the case, is for another post). The Dutta & Donaldson paper then came out, which was far more extensive than our results and so we moved on.
A few things prompted me to write this work up. Not least, Yasmina had since shown that our mutations were sufficient to prevent AP-2 binding to clathrin. This result filled a hole in our work. The other things were:
- People continuing to use pitstops in published work, without acknowledging that they may act non-specifically. The turning point was this paper, which was critical of the Dutta & Donaldson work.
- People outside of the field using these compounds without realising their drawbacks.
- AbCam selling this compound and the thought of other scientists buying it and using it on the basis of the original paper made me feel very guilty that we had not published our findings.
- It kept getting easier and easier to publish “negative results”. Journals such as Biology Open from Company of Biologists or PLoS ONE and preprint servers (see below) make this very easy.
Finally, it was a Twitter conversation with Jim Woodgett that convinced me that, when I had the time, I would write it up.
We have our own results on the non-specificity of pitstops – if only there was a good/easy way to publish it and get the data out there.
— Steve Royle (@clathrin) October 17, 2013
To which, he replied:
— Jim Woodgett (@jwoodgett) October 17, 2013
I added an acknowledgement to him in our paper! So that, together with the launch of bioRxiv, convinced me to get the paper online.
The Preprinting Experience
This paper was our first proper preprint. We had put an accepted version of our eLife paper on bioRxiv before it came out in print at eLife, but that doesn’t really count. For full disclosure, I am an affiliate of bioRxiv.
The preprint went up on 13th February and we submitted it straight to Biology Open the next day. I had to check with the Journal that it was OK to submit a deposited paper. At the time they didn’t have a preprint policy (although I knew that David Stephens had submitted his preprinted paper there and he told me their policy was about to change). Biology Open now accept preprinted papers – you can check which journals do and which ones don’t here.
My idea was that I just wanted to get the information into the public domain as fast as possible. The upshot was, I wasn’t so bothered about getting feedback on the manuscript. For those that don’t know: the idea is that you deposit your paper, get feedback, improve your paper then submit it for publication. In the end I did get some feedback via email (not on the bioRxiv comments section), and I was able to incorporate those changes into the revised version. I think next time, I’ll deposit the paper and wait one week while soliciting comments and then submit to a journal.
The preprint was viewed quite a few times while the paper was being considered by Biology Open. I spoke to a PI who told me that they had found the paper and stopped using pitstop as a result. I think this means getting the work out there was worth it after all.
Now it is out “properly” in Biology Open and anyone can read it.
Verdict: I was really impressed by Biology Open. The reviewing and editorial work were handled very fast. I guess it helps that the paper was very short, but it was very uncomplicated. I wanted to publish with Biology Open rather than PLoS ONE as the Company of Biologists support cell biology in the UK. Disclaimer: I am on the committee of the British Society of Cell Biology which receives funding from CoB.
Depositing the preprint at bioRxiv was easy and for this type of paper, it is a no-brainer. I’m still not sure to what extent we will preprint our work in the future. This is uncharted territory that is evolving all the time, so we’ll see. I can say that the experience for this paper was 100% positive.
Dutta, D., Williamson, C. D., Cole, N. B. and Donaldson, J. G. (2012) Pitstop 2 is a potent inhibitor of clathrin-independent endocytosis. PLoS One 7, e45799.
Lemmon, S. K. and Traub, L. M. (2012) Getting in Touch with the Clathrin Terminal Domain. Traffic, 13, 511-9.
Stahlschmidt, W., Robertson, M. J., Robinson, P. J., McCluskey, A. and Haucke, V. (2014) Clathrin terminal domain-ligand interactions regulate sorting of mannose 6-phosphate receptors mediated by AP-1 and GGA adaptors. J Biol Chem. 289, 4906-18.
von Kleist, L., Stahlschmidt, W., Bulut, H., Gromova, K., Puchkov, D., Robertson, M. J., MacGregor, K. A., Tomilin, N., Pechstein, A., Chau, N. et al. (2011) Role of the clathrin terminal domain in regulating coated pit dynamics revealed by small molecule inhibition. Cell 146, 471-84.
Willox, A.K., Sahraoui, Y.M.E. & Royle, S.J. (2014) Non-specificity of Pitstop 2 in clathrin-mediated endocytosis Biol Open, doi: 10.1242/bio.20147955.
Willox, A.K., Sahraoui, Y.M.E. & Royle, S.J. (2014) Non-specificity of Pitstop 2 in clathrin-mediated endocytosis bioRxiv, doi: 10.1101/002675.
The post title is taken from ‘Into The Great Wide Open’ by Tom Petty and The Heartbreakers from the LP of the same name.