ScienceOpen smashes through the 20 million article record mark

Today, we are pleased to announce that ScienceOpen has passed the 20 million article record mark! In fact, the number is still climbing even as this is being written. This is thanks to what we call our ‘aggregation’ engine, which takes published research articles from any field and applies a little bit of magic to them to open up their context and let us all do amazing things, such as find similar articles, review them through post-publication peer review, and trace their citation genealogies.

I asked Alexander Grossmann, professor of publishing and co-founder of ScienceOpen, what this milestone means to him and to open science more broadly.

How does it feel to have reached such a major milestone at ScienceOpen?

It’s terrific to have achieved this step so quickly after we managed to aggregate 10 million article records a few months ago.

[Screenshot: 20 million article records]

Why is it so important that SciELO was the main catalyst for this?

The import of articles from SciELO was not only another major step towards this milestone. I believe that it may also help authors from Latin America achieve more visibility.

How important do you think it is to break down these sorts of geographical barriers for the progress of open research?

As we all know, indexing resources such as Web of Science or Google Scholar do not cover scholarly articles from every region as well as they cover research by authors from the Western hemisphere. At ScienceOpen, we believe that research from any part of the world is important, and should all be given the same opportunity to be re-used and shared.


How easy was it to integrate the content from SciELO? What exactly did we do with their content?

The teams at both SciELO and ScienceOpen did a great job in making this fantastic success happen. It took some effort, but in the end we managed to resolve all the information we need to display those articles in the same way as any other content on our platform.

What is aggregation, and why is it so important in the digital research age?

Every year, more than 2 million research papers are published in scholarly journals. It is therefore becoming more and more difficult for scientists to find the relevant research, even in their narrow field of expertise. Moreover, there is an ongoing effort to deposit research output in repositories which are not tracked by established indexing services. Aggregating this content from different sources, such as journals, repositories, or pre-print servers like the arXiv, may become the key to letting scholars not only find the most recent research in their discipline but also discover other papers which they would not have found otherwise.

Why are non-open access articles included alongside those which are fully open?

So far, only 15-20 percent of all research has been published as immediate (gold) open access. I am confident that we will have achieved a much higher number within the next few years. This view is also shared by major scholarly publishers. For the moment, however, we should help researchers find the most relevant content, not only that small fraction of it which is open access.

How important is it for open and non-open content to have well-structured metadata?

Without well-structured metadata, it becomes very difficult to show the relevant information. In STM, most major publishers now provide all articles in an appropriate XML structure, but in other disciplines, such as the Humanities, the majority of research publications are not available as full-text XML or even in a well-structured format. This is less and less acceptable for researchers, and authors should ask their publishers for this service. Otherwise, there is little chance that their work will be found or covered, for example by indexing services or in social media.

What is the next milestone for ScienceOpen? And what is your ultimate goal?

We will continue to grow and further increase the usage of ScienceOpen to become the one-stop reference for both researchers and institutions in any discipline. Already about 70 percent of our users reach ScienceOpen via Google, which is a prerequisite for achieving that goal.

Thank you for your time, Alexander! At ScienceOpen, we’re looking forward to the next steps along our open journey. We hope you’ll all join us for it, and help us all to make science more open!