
Finding relevant articles in the information haystack

Image credit: @academicssay, Twitter

Previously I saw a headline that read “Search is so 2014”! I stopped and questioned whether I agreed with that statement. The article went on to describe some of the more interesting developments in finding the “right article in the rapidly growing information haystack”, and some of them matched my own picks, which include:

  • SNAP from JSTOR Labs – a mobile app that allows you to take a picture of any page of text and get a list of research articles from JSTOR on the same topic.
  • Sparrho – a content recommendation engine that aggregates and distills information based on user preferences and makes personalised suggestions. We invited their team to post a guest blog.
  • Knowledge domain visualizations (Peter Kraker, LSE Impact Blog) – these present the main areas in a field and assign relevant articles to them.

However, I still believe that there is a role for Search in 2015, even as it is eventually replaced or enriched by more sophisticated tools.

The part Search plays here at ScienceOpen is particularly important given that we are just beginning our quest to aggregate the world’s Open Access content in all disciplines. The corpus here is growing (nearly 1.5 million articles from nearly 2.5 million authors), and the scientific literature as a whole is expanding at a rate of more than 2 articles per minute (Mark2Cure). Both are good reasons why we have been focusing our development efforts on improving the precision of our search results: to some extent, “if you can’t find it, it doesn’t exist”.

For Search to qualify as “good” in my book it needs to be precise, fast and flexible. Here’s my mini review of ScienceOpen Search:

  • Search delivered rapid and accurate results, so two thumbs up here.
  • The results could be parsed using the aggregation source (PubMed Central, ArXiv and ScienceOpen) or the name of the originating journal/publisher.
  • For the geeks among you, our Search is powered by Elasticsearch.
  • When I forgot the exact spelling of an author name, this field offered me possible name options to pick from (nice).
  • As a publisher myself, I had to try searching by company name. I was surprised to find 1555 OA articles by the American Chemical Society (ACS) on our platform. I also found 2816 articles from Elsevier. This is a tiny fraction of their output but at least something is there.
  • In a nod to our belief that journals will become increasingly less important (and hopefully the stranglehold of the Impact Factor (IF) will be released) as researchers aggregate content themselves (for example using our new Collection tool), users can search by Collection (which has its own tab).
  • Once you’ve found a relevant article, we provide the XML (and PDF) because, let’s be honest, in the digital future a static PDF probably won’t be of much use.
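
For the geeks who want to picture what sits behind two of the features above (forgiving author-name matching and narrowing results by aggregation source), here is a minimal sketch of an Elasticsearch query body built in Python. It is an illustration only, not the actual ScienceOpen implementation: the function name build_search_query and the field names "author" and "source" are assumptions, since the real index mapping is not described in this post. The query structure itself (a bool query with a fuzzy match clause and a term filter) is standard Elasticsearch Query DSL.

```python
import json


def build_search_query(author, source=None):
    """Build an Elasticsearch Query DSL body as a plain dict.

    Hypothetical sketch: the field names "author" and "source" are
    assumptions, not the real ScienceOpen index mapping.
    """
    bool_clause = {
        # A fuzzy match tolerates small misspellings of the author name.
        "must": [
            {"match": {"author": {"query": author, "fuzziness": "AUTO"}}}
        ]
    }
    if source is not None:
        # Exact-value filter on the aggregation source; this assumes
        # "source" is mapped as a keyword (non-analyzed) field.
        bool_clause["filter"] = [{"term": {"source": source}}]
    return {"query": {"bool": bool_clause}}


if __name__ == "__main__":
    # Print the JSON body that would be sent to a search endpoint.
    body = build_search_query("Krakr", source="PubMed Central")
    print(json.dumps(body, indent=2))
```

The design choice worth noting is that the source restriction goes in the filter clause rather than in must: filters narrow the result set without affecting relevance scoring, which suits a yes/no facet such as "came from PubMed Central".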

I want to acknowledge the ScienceOpen Dev team (Raj, Ed and X, led by Tibor) for their excellent work on this release.