“Search is the new journal!” was one of the rallying cries at the recent Force11 meeting in Berlin. But what does this mean? Well, we have a bit of a problem in research – there is so much content being published these days, about 2-3 million papers each year from around 50,000 journals! It has never been more crucial to have efficient ways of searching to discover relevant work for your research question. No single human is capable of this alone.
Now, we know Google Scholar is usually everyone’s search engine of choice for research articles. But when you pop in a search term, how do you know what research is good, what’s relevant to you, what people are talking about? You just get an enormous list that trails off with ever-decreasing relevance, and are supposed to be able to figure that all out yourself. We can do better.
Quality and quantity
Efficient search is the core issue that our freely accessible multi-layer discovery engine is helping to solve. The current database at ScienceOpen has more than 36 million article records, and is growing by around 100,000 new records each week. Each of these records is linked within the database to other articles through our open citation network.
We use this citation information, and other article metadata, to provide an enriched search ecosystem for users. The purpose of this is to allow users to drill down to relevant research using a range of different contexts and criteria, saving time and energy, and facilitating research discovery along multiple dimensions.
Sort by citation count
Citations are still one of the main forms of ‘academic’ currency in a modern research world. Citations only measure how many times a piece of work has been cited without additional context. As such, they are a simple proxy for ‘scholarly discussion’ of a piece of work, but beyond this are essentially devoid of legitimacy as a metric.
Sorting a search result by citations allows you to see what is most popular in a research context, and which articles have been particularly important in developing new disciplines, ideas, and ways of thinking. Identifying highly-cited articles gives you a great starting point for further discovery. Citations reveal the lineage of ideas – start at the top, and work your way down! Understanding the historical context of ideas is critical for good research, and ScienceOpen helps you to explore this.
Sort by Altmetric score
Altmetric scores are a combined measure of social attention for articles. They give us a nice idea of how much an article is being discussed in news outlets or on social media. If you want to keep up with the buzz in your field, or find out what’s of interest in another, ScienceOpen gives you the tools for that.
Missing an article or citation from ScienceOpen, or want to add more of your own publications? Users can now request articles to be integrated into our database via their dashboard. These can be your own articles, or someone else’s – the choice is yours!
All we need are either a list of:
Simply upload a file or copy and paste them in, click the button and away you go! We’ll send you a notification by email to let you know the status of each article. We’ll work our magic behind the scenes and integrate your selection as soon as is computationally possible.
Boost your citations
One of the great things about this new feature is that you can add a list of DOIs of articles that cite your own work. We provide a free and open citation network for each of our users, based on extracting citation data from peer reviewed publications. Thanks to initiatives like I4OC, it is becoming easier to provide enriched citation information like we do for researchers for free.
Adding research that cites your work is an easy way to make sure that your citation profile is complete! This isn’t gaming the system, it’s simply making it comprehensive and open. That’s important. Put this in the context of our recently launched author-metrics, and you’re on to a winning academic profile!
For collection editors
If you have a collection at ScienceOpen, you can specify that these records be automatically integrated into it. You can add these in bulk, with 100 DOIs per request for now. Personalising your collections and making them complete has never been easier! If you want to set up your own collection and try out these features, contact us here!
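If you have a long list of DOIs, you will need to split it up to stay within the 100-per-request limit. Here is a minimal sketch of how that could be done on your side before uploading. The function name `batch_dois` and the example DOIs are made up for illustration and are not part of any ScienceOpen tool. The sketch lower-cases DOIs before de-duplicating them, since DOI names are case-insensitive.

```python
from typing import Iterator, List

BATCH_SIZE = 100  # per-request limit mentioned in the text


def batch_dois(dois: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield cleaned, de-duplicated DOIs in chunks of at most `size`."""
    seen = set()
    cleaned = []
    for doi in dois:
        # Trim whitespace and lower-case so duplicates are detected reliably.
        key = doi.strip().lower()
        if key and key not in seen:
            seen.add(key)
            cleaned.append(key)
    # Slice the cleaned list into consecutive chunks, preserving input order.
    for start in range(0, len(cleaned), size):
        yield cleaned[start:start + size]


# Hypothetical DOIs, for illustration only.
example = [f"10.1234/example.{i}" for i in range(250)]
batches = list(batch_dois(example))
print([len(b) for b in batches])  # [100, 100, 50]
```

Each batch can then be pasted or uploaded as a separate request.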
Integration and validation
By using the new ‘claim authorship’ feature, your articles will be directly integrated with your ScienceOpen profile and ORCID. This provides crucial cross-validation of your research history, a unique feature of ScienceOpen. If you’re adding your own article records, these will be available in your ‘Claim your articles’ section of the Dashboard, where you can easily add them to your profile.
We recognise that no research database is complete, and ScienceOpen is no exception. We work closely with publishers, ORCID, and platforms like PubMed to integrate new content on a daily basis. But we can’t pick up everything, and that’s where you come in!
By adding personalised content, you help us to fill in the blank spots in our database. This helps to enrich our network by putting this content into our semantically linked network. We are currently only indexing research articles and not book chapters, proceedings or other content types.
So pop over to your dashboard, try it out, and let us know what you think!
ScienceOpen joins the growing list of stakeholders who support the I4OC initiative, alongside OASPA (the Open Access Scholarly Publishers Association), Jisc, and LIBER (the Association of European Research Libraries).
What does ScienceOpen have to do with the I4OC?
The analysis of citation data is at the core of what we do at ScienceOpen. Citations trace academic networks, describe research genealogies, and uncover ideas. They enable a great range of functions dependent on the connections and context that they reveal to us.
There are several ways in which we use citation data at ScienceOpen:
To sort publications by citation numbers. There are nearly always too many papers to read through them all. So every search result list on ScienceOpen can be filtered and sorted by citation numbers to find relevant articles. This powerful filter is supplemented by Altmetric score, usage, date, and more.
To sort reference lists by citation numbers. The reference list of a paper is an important discovery tool for researchers, but it often contains 50-100 references. The sort and search tools at ScienceOpen allow both a quick overview and in-depth searching within the reference section – now for many more papers with the I4OC initiative!
To increase visibility of open content. If your article cites 50 papers, there will be 50 more article pages on ScienceOpen that point back to your original paper. The increased linkage helps to define networks of similarity that show the right paper to the user searching for information.
To provide citation information for any author on our platform. Integrate your ORCID and claim your publications today, and you can track your citations through time.
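To make the sorting idea concrete, here is a small sketch of ordering a result list by citation count, Altmetric score, or date. It is only an illustration: the `Record` fields, the `sort_records` helper, and the sample values are assumptions, not the actual ScienceOpen data model or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    # Hypothetical fields chosen for this example.
    title: str
    citations: int
    altmetric: float
    year: int


def sort_records(records: List[Record], key: str = "citations") -> List[Record]:
    """Return a new list sorted in descending order by the chosen field."""
    if key not in {"citations", "altmetric", "year"}:
        raise ValueError(f"unsupported sort key: {key}")
    return sorted(records, key=lambda r: getattr(r, key), reverse=True)


# Made-up sample records.
records = [
    Record("Paper A", citations=12, altmetric=3.0, year=2015),
    Record("Paper B", citations=250, altmetric=1.5, year=2009),
    Record("Paper C", citations=40, altmetric=88.0, year=2017),
]

by_citations = [r.title for r in sort_records(records)]
by_altmetric = [r.title for r in sort_records(records, key="altmetric")]
print(by_citations)  # ['Paper B', 'Paper C', 'Paper A']
print(by_altmetric)  # ['Paper C', 'Paper A', 'Paper B']
```

The same list gives a very different "top" paper depending on the key, which is why having more than one sort option is useful.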
At ScienceOpen, we’re constantly upgrading our platform to provide the best possible user interaction experience. We get feedback from the research community all the time, and try to adapt to best meet their needs.
So today, we’re happy to announce two neat little features in our latest updates.
Firstly, all Open Access articles now have a cute little symbol next to them, making it even easier for you to discover open content. This shows up on all of our Open Access content across nearly 14 million article records now. Making open content stand out is a great way to encourage others to adopt open practices, as well as help people see which content they can re-use most easily.
As well as this, we have a new browsing function built into our collections. Sometimes, collections are pretty big. Our new SciELO collections have some with tens of thousands of open access articles, and sifting through that manually is not exactly a valuable use of one’s time.
With this new function, you can now filter content within collections by journal, publisher, keywords, and even filter them by citations or Altmetric scores. Discovering content relevant to your research should be smart and efficient, and this is what our platform delivers. Try it out on this collection, or build your own!
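As a rough picture of what filtering within a collection involves, the sketch below keeps only the articles that match every filter supplied. The dictionary layout, the `filter_collection` helper, and the sample articles are hypothetical and only illustrate the idea.

```python
from typing import Any, Dict, List, Optional

# Made-up collection contents.
collection: List[Dict[str, Any]] = [
    {"title": "Paper A", "journal": "Journal X", "keywords": ["ecology"], "citations": 5},
    {"title": "Paper B", "journal": "Journal Y", "keywords": ["ecology", "climate"], "citations": 42},
    {"title": "Paper C", "journal": "Journal X", "keywords": ["climate"], "citations": 17},
]


def filter_collection(
    articles: List[Dict[str, Any]],
    journal: Optional[str] = None,
    keyword: Optional[str] = None,
    min_citations: int = 0,
) -> List[Dict[str, Any]]:
    """Keep articles that satisfy every filter that was supplied."""
    result = []
    for article in articles:
        if journal is not None and article["journal"] != journal:
            continue
        if keyword is not None and keyword not in article["keywords"]:
            continue
        if article["citations"] < min_citations:
            continue
        result.append(article)
    return result


climate_x = [a["title"] for a in filter_collection(collection, journal="Journal X", keyword="climate")]
well_cited = [a["title"] for a in filter_collection(collection, min_citations=10)]
print(climate_x)   # ['Paper C']
print(well_cited)  # ['Paper B', 'Paper C']
```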
Context is something we’ve been thinking a lot about at ScienceOpen recently. It comes from the Latin ‘con’ and ‘texere’ (to form ‘contextus’), which means ‘weave together’. The implications for science are fairly obvious: modern research is about weaving together different strands of information, thought, and data to place your results into the context of existing research. This is the reason why we have introductory and discussion sections at the intra-article level.
But what about context at a higher level?
Context can be defined as: “The circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood.” Simple follow on questions might be then, what is the context of a research article? How do we define that context? How do we build on that to do science more efficiently? The whole point for the existence of research articles is that they can be understood by as broad an audience as possible so that their re-use is maximised.
There are many things that impinge upon the context of research. Paywalls, secretive and exclusive peer review, lack of discovery, lack of inter-operability, lack of accessibility. The list is practically endless, and a general by-product of the failure of traditional scholarly publishing models to embrace a Web-based era.
Eugene Garfield, one of the founders of bibliometrics and scientometrics, once claimed that “Citation indexes resolve semantic problems associated with traditional subject indexes by using citation symbology rather than words to describe the content of a document.” This statement led to the advent of Web-based measurements of citations, implemented as a way to describe the academic re-use of research.
However, Garfield had only reached a partial solution to a problem about measuring re-use, as one of the major problems with citation counts is that they are primarily contextless: they don’t tell us anything about why research is being re-used. Nonetheless, citation counts are now at the very heart of academic systems for two main reasons:
They are fundamental for grant, hiring and tenure decisions.
They form the core of how we currently assess academic impact and prestige.
Working out article-level citation counts is actually pretty complicated though, and depends on where you’re sourcing your information from. If you read the last blog post here, you’ll have seen that search results across Google Scholar, Web of Science, PubMed, and Scopus all vary to quite some degree. Well, it is the same for citations too, and it comes down to what’s being indexed by each. Scopus indexes 12,850 journals, which is the largest documented number at the moment. PubMed on the other hand has 6,000 journals comprising mostly clinical content, and Web of Science offers broader coverage with 8,700 journals. However, unless you pay for both Web of Science and Scopus, you won’t be allowed to know who’s re-using work or how much, and even if you are granted access, both services offer inconsistent results. Not too useful when these numbers matter for impact assessment criteria and your career.
Google Scholar, however, offers a free citation indexing service, based, in theory, on all published journals, and possibly with a whole load of ‘grey literature’. For the majority of researchers now, Google Scholar is the go-to powerhouse search tool. Accompanying this power though is a whole web of secrecy: it is unknown what Google Scholar actually crawls, but you can bet they reach pretty far given the amount of self-archived, and often illegally archived, content they return from searches. So the basis of their citation index is a bit of a mystery and lacks any form of quality control, a problem compounded by the fact that it can include citations from non-peer-reviewed works, which will be an issue for some.
Academic citations represent the structured genealogy or network of an idea, and the association between themes or topics. I like to think that citation counts tell us how imperfect our knowledge is in a certain area, and how much researchers are working to change that. Researchers quite like citations; we like to know how many citations we’ve got, and who it is who’s citing and re-using our work. These two concepts are quite different: re-use can be reflected by a simple number, which is fine in a closed system. But to get a deeper context of how research is being re-used and to trace the genealogy of knowledge, you need openness.
At ScienceOpen, we have our own way to measure citations. We’ve recently implemented it, and are only just beginning to realise the importance of this metric. We’re calling it the Open Citation Index, and it represents a new way to measure the retrieval of scientific information.
But what is the Open Citation Index, and how is it calculated? The core of ScienceOpen is based on a huge corpus of open access articles drawn primarily from PubMed Central and arXiv. This forms about 2 million open access records, and each one comes with its own reference list. What we’ve done using a clever metadata extraction engine is to take each of these citations and create an article stub for them. These stubs, or metadata records, form the core of our citation network. The number of citations derived from this network is displayed on each article, and each item that cites another can be openly accessed from within our archive.
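The counting step described above can be pictured as inverting the reference lists: every cited item gets a record, and its count is the number of distinct open access articles that cite it. The sketch below is a simplified illustration under that assumption, not the actual extraction engine. The identifiers, the `reference_lists` input, and the `build_citation_network` helper are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical input: each open access article ID mapped to the IDs it cites.
reference_lists: Dict[str, List[str]] = {
    "oa-1": ["x-1", "x-2", "oa-2"],
    "oa-2": ["x-1"],
    "oa-3": ["x-1", "x-2", "x-2"],  # duplicate reference, counted once
}


def build_citation_network(refs: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert reference lists into: cited ID -> set of citing article IDs."""
    cited_by: Dict[str, Set[str]] = defaultdict(set)
    for citing, cited_ids in refs.items():
        for cited in cited_ids:
            # Using a set means a repeated reference is not double-counted.
            cited_by[cited].add(citing)
    return dict(cited_by)


network = build_citation_network(reference_lists)
counts = {article: len(citers) for article, citers in network.items()}
print(counts)  # {'x-1': 3, 'x-2': 2, 'oa-2': 1}
```

Because each entry in `network` is created from a reference, every record built this way has at least one citation, and the set of citing articles stays available for anyone who wants to see who is re-using a piece of work.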
So the citation counts are based exclusively on open access publications, and therefore provide a pan-publisher, article-level measure of how ‘open’ your idea is. Based on the way these data are gathered, it also means that every article record has had at least one citation, and therefore we explicitly provide a level of cross-publisher content filtering. It is pertinent that we find ways to measure the effect of open access, and the Open Citation Index provides one way to do this. For researchers, the Open Citation Index is about gaining prestige in a system that is gradually, but inevitably and inexorably, moving towards ‘open’ as the default way of conducting research.
In the future, we will work with publishers to combine their content with our archives and enhance the Open Citation Index, developing a richer, increasingly transparent and more precise metric of how research is being re-used.
The amount of published scientific research is simply enormous. Current estimates are over 70 million individual research articles, with around 2 million more being published every year. We are in the midst of an information revolution, with the World Wide Web offering rapid, structured and practical distribution of knowledge. But for researchers, this creates the monolithic task of manually finding relevant content to fuel their work, and raises the question, are we doing the best we can to leverage this knowledge?
There are already several well-established searchable archives, scientific databases representing warehouses for all of our knowledge and data. The most well-known include the Web of Science, Scopus, PubMed, and Google Scholar, which together are the de facto tools for information retrieval. The first two of these are paid services, and attempts to replicate searches across all platforms produce inconsistent results (e.g., Bakkalbasi et al., Kulkarni et al.), raising questions about each of their methods of procurement. The search algorithms for each are also fairly opaque, and the relative reliability of each is quite uncertain. Each of them, though, has its own benefits and pitfalls, which are far better discussed elsewhere (e.g. Falagas et al.).
So where does this leave discoverability for researchers in a world that is becoming more and more ‘open’?