Last week, we kicked off a series of interviews with some of the top 'open scientists' by speaking to Dr. Joanne Kamens of Addgene, and had a look at some of the great work she's been doing to promote a culture of data sharing and equal opportunity for researchers. Today, we've got something completely different, with Daniel Shanahan of BioMed Central, who recently published a really cool PeerJ paper on auto-correlation and the impact factor.
Hi Daniel! To start things off, can you tell us a bit about your background?
I completed a Master's degree in Experimental and Theoretical Physics at the University of Cambridge, but must admit I did my Master's more to have an extra year to play rugby for the university than out of any love of micro-colloidal particles and electron lasers. I have always loved science, though, and found my way into STM publishing, albeit by a slightly less than traditional route.
Open science is a rapidly evolving field with a huge diversity of actors involved. We want to highlight some of the superstars helping to spearhead the evolution of scholarly communication, who are real positive forces for change. The first of these interviews is with Joanne Kamens, PhD, currently the Executive Director of Addgene, a repository for the life sciences. We asked her about open science, the impact it can have on diversity in research, and the value of repositories. Here's her story!
Hi Joanne! So can you tell us a little bit about your background to get things started?
After graduating from the University of Pennsylvania, I went directly to graduate school in the Harvard Medical School Division of Medical Sciences, where I received a PhD in genetics. For you historians, it was the first year the Division existed, allowing students to move among PIs in many departments. I defended my thesis while six months pregnant and had my son while still working in that lab. I had a great mentor in Dr. Roger Brent (now at the Fred Hutchinson Center in Seattle). I studied transcription using yeast and helped demonstrate that an acidic domain of the Rel protein was activating when brought into proximity with the promoter region. Again for historical perspective, PCR was invented while I was in grad school, and I got to beta test the first MJ Research PCR machine (MJ worked on my floor), which had no outsides. Roger Brent's lab was one of the labs that created the yeast two-hybrid screening system, and I have always been a lover of molecular biology technology, which serves me well at Addgene.
Doing peer review is tough. Building a Collection is tough. Both are also time consuming, and academics are like the White Rabbit from Alice in Wonderland: never enough time!
So while the benefits of open peer review and building Collections need to be considered in the 'temporal trade-off' world of research, what are some other things researchers can do to help advance open science with us?
Here’s a simple list of 10 things that take anything from a few seconds to a few minutes!
Rate an article. You don’t have to do a full peer review, but can simply provide a rating. Come back later and provide a full review!
Recommend an article. Click, done. Interested researchers can see which articles are more highly recommended by the community.
Share an article. Use social media? Share on Facebook, Twitter, Google+, by email, or directly on ScienceOpen.
Comment on an article. Members with one item in their ORCID accounts can comment on any article.
Follow a Collection. See a Collection you like (like this one)? Click 'Follow', done.
Comment on a Collection. As with all our articles, Collection articles can be commented on, shared, recommended and peer reviewed.
Become a ScienceOpen member. It’s not needed for many of the functions on our platform, but does mean you can engage with the existing community and content more. Register here!
Have you replicated someone’s results? Let them know that in a comment!
Think someone’s methods are really great? Let them know in a comment!
Did someone not cite your work when they should have? Let them know in a comment!
All articles can be commented on. All you need is a membership and an ORCID account with just one item. Easy! Comments can be as short and sweet, or as long, as you like. But sometimes a comment can be worth a lot to researchers and communities, in terms of offering new thoughts, perspectives, or validation. Comments are also a great way for junior researchers to engage with existing research communities.
We have new Collections coming out of our ears here at ScienceOpen! Last week, we saw two published: one on the bacterium Shewanella, and another on the Communication Through Coherence theory. Both should represent great platforms and resources for further research in those fields.
The latest is on the diverse field of Atomic Force Microscopy. We asked the Editor, Prof. Yang Gan, to give us a few details about why he created this Collection.
This Collection celebrates the 30th anniversary of atomic force microscopy (AFM). March 3, 1986 saw the publication of the landmark paper "Atomic force microscope" by G. Binnig, C. F. Quate and Ch. Gerber (Phys Rev Lett, 56 (1986) 930-933, citations >8,800), motivated by the desire to invent "a new type of microscope capable of investigating surfaces of insulators on an atomic scale" with high force and dimensional resolution. AFM can be used to measure local properties, such as height, friction, and magnetism, and so has massive implications for science.
Since then, AFM has given birth to a large family of scanning probe microscopy (SPM), or SXM, techniques, where X stands for near-field optical, Kelvin, magnetic, acoustic, thermal, etc. More than 100,000 journal papers (around 6,000 per year since 2008) have been published, if one searches the Scopus database for "atomic force microscopy" or "force microscope". On ScienceOpen, a search for the keywords "atomic force microscopy" likewise returns over 6,000 article records. Nowadays, many disciplines (physics, chemistry, biology, materials, minerals, medicine, geology, nanotechnology, etc.) all benefit greatly from using AFM as an important and even key tool for characterization, fabrication and processing.
The aim of this partnership is to standardise and integrate information that is currently distributed across more than 230 systems and databases in Germany. Adopting ORCID will support German universities and research institutes in implementing it in a co-ordinated and sustainable way.
“Thanks to the financial support from the Deutsche Forschungsgemeinschaft we have now the opportunity to promote the use of ORCID in Germany. This is a strong signal for ORCID in Germany,” says Roland Bertelmann, head of the Library and Information Services at the German Research Centre for Geoscience (GFZ).
ORCID is a critical part of research infrastructure. It acts as a unique identifier for researchers and provides a sort of LinkedIn-style profile with your published research, educational history, and professional history embedded, and it is partnered with services such as CrossRef and Scopus to make content integration easy and automated.
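For the technically minded, these records are machine-readable too. Below is a minimal sketch of pulling a public researcher record over ORCID's public v3.0 REST API; the endpoint and response fields follow ORCID's public documentation, and the example iD is ORCID's own fictitious test researcher, not anyone mentioned in this post.

```python
# Minimal sketch: fetch a public ORCID record via the v3.0 public API.
# Assumes the `requests` library is installed (pip install requests).
import requests

ORCID_ID = "0000-0002-1825-0097"  # Josiah Carberry, ORCID's fictitious test researcher

resp = requests.get(
    f"https://pub.orcid.org/v3.0/{ORCID_ID}/record",
    headers={"Accept": "application/json"},
    timeout=10,
)
resp.raise_for_status()
record = resp.json()

# Personal details and attached works live under these keys in v3.0.
name = record["person"]["name"]
print(name["given-names"]["value"], name["family-name"]["value"])

works = record["activities-summary"]["works"]["group"]
print(f"{len(works)} works attached to this ORCID iD")
```

It is exactly this kind of machine-readable record that lets platforms and partners match publications to people automatically.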
The Zika virus is an international public health emergency, as declared in early February by the World Health Organisation. As such, it is critical that the global research community helps combat this threat as rapidly and efficiently as possible. This is a case where science can quite literally save lives.
Recently, an article on the host-vector ratio of the Zika virus was published on the arXiv, a platform for articles often called 'preprints'. This means that the work has not yet been peer reviewed; nor can it be commented on within the arXiv itself, due to functional constraints. The paper is stuck in the hidden, timeless limbo of peer review until it eventually emerges as a published paper or is ultimately rejected.
ScienceOpen Collections are thematic groups of research articles that transcend journals and publishers to transform how we collate and build upon scientific knowledge.
What are Collections?
The modern research environment is a hyper-dimensional space with a vast quantity of outputs that are impossible to manually manage. You can think of research like a giant Rubik’s cube: you have different ‘colours’ of research that you have to mix and match and play around with to discover how the different sections fit together to become something useful.
We view Collections as the individual faces of a Rubik’s cube. They draw from the vast, and often messy, pool of published research to provide an additional layer of context and clarity. They represent a new way for researchers to filter the published record to discover and curate content that is directly relevant to them, irrespective of who published it or what journal it appears in.
Advantages of Collections
Perhaps the main advantage of Collections for researchers is that they are independent of journals and publishers, and of their branding criteria. Researchers are undoubtedly the best placed to assess what research is relevant to themselves and their communities. As such, we see Collections as a natural continuation of the transformation of the modern journal, bringing it almost full circle back to its basic principles.
The advantage of using Collections is that they give researchers the power to filter and select from the published record and create what is, in essence, a highly specialised virtual journal. Collections are not pre-selective; instead, they comprise papers chosen by a single criterion: research that is relevant to your peers, and deemed relevant by them.
Filtering for Collections occurs at different levels, depending on the scope or complexity of the research. For example, Collections can be designed to focus on different research topics, lab or research groups, communities, or even departments or institutions. Collections can also be created for specific conferences and include posters from these, published on ScienceOpen. You define the scope and the selection criteria.
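To make that concrete, here is a toy sketch of a Collection as a saved filter over article metadata. The class names and keyword-matching rule are illustrative assumptions for this post, not ScienceOpen's actual implementation.

```python
# Toy model: a Collection is an editor-defined filter over article metadata,
# selecting by relevance and ignoring journal and publisher entirely.
from dataclasses import dataclass, field

@dataclass
class Article:
    title: str
    journal: str
    keywords: set

@dataclass
class Collection:
    name: str
    criteria: set                        # editor-defined scope, e.g. keywords
    articles: list = field(default_factory=list)

    def consider(self, article):
        # Selection looks only at relevance, never at where the work appeared.
        if article.keywords & self.criteria:
            self.articles.append(article)

afm = Collection("Atomic Force Microscopy", {"AFM", "scanning probe microscopy"})
afm.consider(Article("Tip-sample forces revisited", "Journal A", {"AFM", "nanoindentation"}))
afm.consider(Article("Shewanella biofilms", "Journal B", {"bacteria"}))

print([a.title for a in afm.articles])  # only the AFM paper is collected
```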
Eugene Garfield, one of the founders of bibliometrics and scientometrics, once claimed that "Citation indexes resolve semantic problems associated with traditional subject indexes by using citation symbology rather than words to describe the content of a document." This statement heralded a new dawn of Web-based citation measurement, implemented as a way to describe the academic re-use of research.
However, Garfield had only partially solved the problem of measuring re-use: one of the major shortcomings of citation counts is that they are largely context-free, telling us nothing about why research is being re-used. Nonetheless, citation counts are now at the very heart of academic systems, for two main reasons:
They are fundamental for grant, hiring and tenure decisions.
They form the core of how we currently assess academic impact and prestige.
Working out article-level citation counts is actually pretty complicated, though, and depends on where you source your information. If you read the last blog post here, you'll have seen that search results from Google Scholar, Web of Science, PubMed, and Scopus all vary to quite some degree. Well, it's the same for citations, and it comes down to what each service indexes. Scopus indexes 12,850 journals, the largest documented number at the moment. PubMed, on the other hand, covers 6,000 journals of mostly clinical content, while Web of Science offers broader coverage with 8,700 journals. However, unless you pay for both Web of Science and Scopus, you won't be able to see who is re-using work, or how much; and even if you are granted access, the two services give inconsistent results. Not too useful when these numbers matter for impact assessment criteria and your career.
Google Scholar, however, offers a free citation indexing service based, in theory, on all published journals, possibly along with a whole load of 'grey literature'. For the majority of researchers now, Google Scholar is the go-to powerhouse search tool. Accompanying this power, though, is a whole web of secrecy: it is unknown exactly what Google Scholar crawls, but you can bet it reaches pretty far, judging by the amount of self-archived, and often illegally archived, content it returns from searches. So the basis of its citation index is a bit of a mystery, lacking any form of quality control, and confounded by the fact that it can include citations from non-peer-reviewed works, which will be an issue for some.
Academic citations represent the structured genealogy, or network, of an idea, and the associations between themes or topics. I like to think that citation counts tell us how imperfect our knowledge is in a certain area, and how much researchers are working to change that. Researchers quite like citations: we like to know how many citations we've got, and who is citing and re-using our work. These two things are quite different: re-use can be reflected by a simple number, which is fine in a closed system. But to get a deeper context for how research is being re-used, and to trace the genealogy of knowledge, you need openness.
At ScienceOpen, we have our own way to measure citations. We’ve recently implemented it, and are only just beginning to realise the importance of this metric. We’re calling it the Open Citation Index, and it represents a new way to measure the retrieval of scientific information.
But what is the Open Citation Index, and how is it calculated? The core of ScienceOpen is a huge corpus of open access articles drawn primarily from PubMed Central and the arXiv: about 2 million open access records, each with its own reference list. Using a clever metadata extraction engine, we take each of these cited references and create an article stub for it. These stubs, or metadata records, form the core of our citation network. The number of citations derived from this network is displayed on each article, and each item that cites another can be openly accessed from within our archive.
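In rough pseudo-form, the idea looks something like the sketch below. This is a deliberately simplified illustration of the stub-and-count logic just described, using made-up DOIs and field names rather than our actual extraction engine.

```python
# Simplified sketch: each record's reference list spawns a metadata stub, and
# an article's citation count is the number of records whose references point
# at it (its in-degree in the network).
from collections import defaultdict

# Source corpus: open access records, each with a parsed reference list.
corpus = {
    "10.1000/a": {"title": "Article A", "references": ["10.1000/b", "10.1000/c"]},
    "10.1000/b": {"title": "Article B", "references": ["10.1000/c"]},
}

stubs = {}                     # DOI -> minimal metadata record
cited_by = defaultdict(list)   # DOI -> records that cite it

for doi, record in corpus.items():
    stubs.setdefault(doi, {"title": record["title"]})
    for ref in record["references"]:
        # Create a stub for every cited item, even ones outside the corpus.
        stubs.setdefault(ref, {"title": None})
        cited_by[ref].append(doi)

# Every stub created from a reference list starts with at least one citation.
for doi in stubs:
    print(doi, len(cited_by[doi]))
```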
So the citation counts are based exclusively on open access publications, and therefore provide a pan-publisher, article-level measure of how 'open' your idea is. Given the way these data are gathered, every article record also has at least one citation, so we explicitly provide a level of cross-publisher content filtering. It is pertinent that we find ways to measure the effect of open access, and the Open Citation Index provides one way to do this. For researchers, the Open Citation Index is about gaining prestige in a system that is gradually, but inevitably and inexorably, moving towards 'open' as the default way of conducting research.
In the future, we will work with publishers to combine their content with our archives and enhance the Open Citation Index, developing a richer, increasingly transparent and more precise metric of how research is being re-used.
The amount of published scientific research is simply enormous. Current estimates put it at over 70 million individual research articles, with around 2 million more published every year. We are in the midst of an information revolution, with the World Wide Web offering rapid, structured and practical distribution of knowledge. But for researchers, this creates the monolithic task of manually finding relevant content to fuel their work, and raises the question: are we doing the best we can to leverage this knowledge?
There are already several well-established searchable archives: scientific databases representing warehouses for all of our knowledge and data. The best known include Web of Science, Scopus, PubMed, and Google Scholar, which together form the de facto mode of information retrieval today. The first two are paid services, and attempts to replicate searches across the platforms produce inconsistent results (e.g., Bakkalbasi et al., Kulkarni et al.), raising questions about how each procures its content. The search algorithms behind each are also fairly opaque, and their relative reliability is quite uncertain. Each has its own benefits and pitfalls, though, which are far better discussed elsewhere (e.g., Falagas et al.).
So where does this leave discoverability for researchers in a world that is becoming more and more ‘open’?
Traditional models of peer review occur pre-publication, performed by selected referees and mediated by an Editor or Editorial Board. This model has been adopted by the vast majority of journals, and acts as the filter that decides what is considered worthy of publication. In this traditional pre-publication model, the majority of reviews are discarded as soon as the research articles are published, and all the insight, context, and evaluation they contain is lost from the scientific record.
Several publishers and journals are now exploring a more adventurous model of peer review that occurs after publication. The principle here is that all research deserves the opportunity to be published, with filtering through peer review taking place after the actual communication of the research. Numerous venues now provide built-in systems for post-publication peer review, including ScienceOpen, RIO, The Winnower, and F1000 Research. Beyond those adopted by journals, there are also post-publication annotation and commenting services, such as hypothes.is and PubPeer, that are independent of any specific journal or publisher and operate across platforms.