Choosing search systems for systematic reviews
Michael Gusenbauer is an assistant professor at the Institute of Innovation Management (ifi) at Johannes Kepler University (JKU) Linz. Here Michael talks about his recent article with Neal Haddaway on selecting academic search systems for systematic reviews.


You recently published a paper about search facilities used in systematic reviews. What did you find?
We found that popular search systems are very heterogeneous not only in their coverage, but also in their search capabilities. This makes them suitable for systematic reviews to varying degrees.

Our approach was to compare the coverage and search functionalities of a selection of 28 academic search systems. Evaluation profiles for each of these systems allow researchers to assess why, and to what degree, a particular system is suitable for their search requirements. Most notably, we found that only 14 of the 28 systems are well suited to serve as principal search systems in systematic searches.

Our findings help reviewers get a sense of which criteria are essential for evidence identification and see how their favourite systems perform compared with others.

Why is it important to know how good different search platforms are?
We hope that this research will help researchers find more relevant results, faster. A significant portion of search-system know-how is accumulated through painful trial and error – that, at least, was my experience. Our findings help reviewers reduce some of these learning-by-doing pains, decide which systems to use, and anticipate issues with specific systems in advance. On multiple occasions in the past I realized that my envisioned search strategy was in fact not supported by a system. The later in the process this happens, the more painful it is. By enhancing transparency among search systems, we hope to create awareness of the search requirements of evidence synthesis across disciplines, among journals and among database providers.

Some bibliographic databases are very expensive to use – what can researchers do if they can’t afford them and still want to conduct systematic reviews?
To stay within legal boundaries, researchers from resource-constrained environments need to rely on the few open access systems available.

In our study we analysed five systems that are both free to access and provide exclusively open access content (arXiv, CiteSeerX, ClinicalTrials.gov, DBLP and DOAJ). We found that only ClinicalTrials.gov provides the search functionalities necessary to be used as a principal search system. The other systems lack some crucial functionalities, making it doubtful that the quality criteria of systematic reviews can be met by relying on these systems alone. They can, however, always be used for supplementary searches.

Among our selection of 28 search systems, Bielefeld Academic Search Engine, with approximately 60% multidisciplinary open access content, currently seems to be the best option for the synthesis of open access content. Yet if the goal is to get a complete picture of the available evidence, there is unfortunately no way around proprietary search systems at the moment.

How can the whole system of searching for literature in systematic reviews be made better?
In general, there are two factors that influence the quality of evidence-identification in systematic reviews: humans and technology.

First, humans can be trained to improve their heuristics so that they use the available tools more efficiently and effectively. This means using the right tools in the right way. Multiple studies have shown that involving trained information specialists in evidence synthesis improves research quality. So, clearly, the human factor is important. At the moment we have the problem that awareness of the requirements of high-quality evidence synthesis varies across disciplines. I would say the social sciences lag behind in developing or adopting standards similar to those established by Cochrane or Campbell.

Second, the technology supporting evidence synthesis is improving fast: not only for literature search, but also for all other stages of conducting systematic reviews, including collaborative assessment, analysis, write-up and presentation. Currently there are many grass-roots projects that aim to improve aspects of evidence synthesis, as evidenced by the Evidence Synthesis Hackathon and other open science projects. I believe these efforts are much needed, as systematic reviews are laborious tasks involving a lot of data handling and virtual collaboration.

Further, our study shows the importance of improving open access alternatives, both in their search functionalities and in the quantity of freely accessible content. If we can lower the barriers to conducting systematic reviews, our collective knowledge has a lot to gain. In this regard our paper sits somewhere between the technological and the human factor. All our database tests were performed from the perspective of regular user queries: we checked whether search results were plausible and whether particular features worked, to determine what one can (and cannot) do or find with a system.
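To illustrate the spirit of such a plausibility check, here is a minimal sketch in Python. It is not the test protocol from our paper; the queries and the hit_count helper are made up for illustration, and it simply compares hit counts from arXiv's public API with and without a Boolean operator:

```python
# A minimal, illustrative plausibility check (not the test protocol from the
# paper): query arXiv's public API and compare hit counts with and without a
# Boolean AND to see whether the operator is honoured literally.
import re
import urllib.parse
import urllib.request

ARXIV_API = "http://export.arxiv.org/api/query"

def hit_count(search_query: str) -> int:
    """Return the total number of results arXiv reports for a query."""
    params = urllib.parse.urlencode({
        "search_query": search_query,
        "start": 0,
        "max_results": 1,  # only the totalResults element is needed
    })
    with urllib.request.urlopen(f"{ARXIV_API}?{params}") as resp:
        feed = resp.read().decode("utf-8")
    match = re.search(r"<opensearch:totalResults[^>]*>(\d+)<", feed)
    return int(match.group(1)) if match else 0

if __name__ == "__main__":
    broad = hit_count("all:offshoring")                      # example query
    narrow = hit_count("all:offshoring AND all:innovation")  # narrowed query
    # If AND is interpreted literally, the narrowed query can never
    # return more hits than the broad one.
    print(f"broad: {broad}, narrow: {narrow}, plausible: {narrow <= broad}")
```

If the narrowed query ever returned more hits than the broad one, the Boolean operator would clearly not be interpreted literally, which is exactly the kind of behaviour a systematic reviewer needs to know about before trusting a system.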

What do you think is the role of new technologies?
I believe technology is essential in evidence synthesis and in research work in general. I guess no one would doubt that. Currently there is a hype around artificial intelligence, a technology celebrated as the solution to exploding piles of data. In evidence synthesis, too, we have to make sense of an exponentially increasing amount of data.

If we stay with the topic of our paper, we see that the search giants have clearly set the direction towards artificial intelligence, where the algorithm knows what you want to find better than you do. Already ten years ago Google’s Larry Page postulated: “We want to create the ultimate search engine […] Some people call that artificial intelligence.” Now we have dominant players like Google Scholar, Microsoft Academic, the upcoming Meta (from the Chan Zuckerberg Initiative), as well as Semantic Scholar and Dimensions.ai, that use semantic search technologies which interpret user queries according to opaque cues. Most internet users rely on semantic search in their daily work (e.g. Google or Google Scholar) and do not bother with what happens behind the curtain as long as the search results look fine.

In science, however, we have to adopt a different mindset around search technologies. As researchers we have to know what we want to achieve with our searches. This is even more true when conducting systematic reviews, where the criteria are comprehensiveness, transparency and reproducibility. In this regard we cannot trust the algorithmic black box; we must rely on systems we have control over and whose retrieval methods we can question. Yet many people inexperienced with systematic searching are not aware that their go-to search system might not be adequate for every search task.

I strongly believe we need to teach these skills not only for evidence synthesis research, but in schools and universities in general: how do we ask better questions (or queries) in an environment of increasingly overwhelming quantity and diversity of information? What struck me here is some research Google did years ago which found that only one in ten American internet users knows about the search function in their browser (CTRL+F). This means a lot of people search web pages line by line. Imagine the waste of time. Search giants respond with simple user interfaces and artificial intelligence, implicitly accepting the lack of human information-retrieval skills. Our effort goes in the opposite direction: educating users about what is possible with search systems, so they can take better control of their information retrieval process, especially when the requirements are elevated, as they are in systematic reviews.

What got you interested in the search process for systematic reviews?
During my doctoral work I was confronted with a large, heterogeneous knowledge base on ‘offshoring’, a management practice of sourcing activities from abroad. I spent a lot of time trying different methods of capturing and making sense of the evidence base of a phenomenon that goes by many different names. I learned that while understanding the phenomenon and its associated language is crucial, it is just as crucial to understand how search systems work. I was mostly self-taught and became more and more interested in the differences between search systems, as I was frequently confronted with their limitations. It was a time of time-outs and error messages. I got blocked quite often because systems thought my repetitive queries came from a bot. I took it as a compliment when my tedious queries could match the productivity of a bot! Now I feel content that this work seems to be creating awareness of the differences between search systems among their users.