Throughout the research world, artificial intelligence is increasingly being applied to scan complicated scientific literature more quickly than humans alone can. At Utrecht University, Prof. Rens van de Schoot and his team are part of an international research community now applying that technology to COVID-19 publications. In an edited email exchange with Diane M. Fresquez of Science Business, van de Schoot talks about his work and his search for collaborators (have you got coding talent?) – initially, while under lockdown with his three children, aged six and under, who played quietly (or not so quietly) underfoot.

Q. Tell us about your COVID-19 project.

With the increase in COVID-19 research literature, and the urgency to find cures and treatments, it is essential that data collection is done in real time.
To help researchers conduct a systematic review or meta-analysis as efficiently and transparently as possible, we designed a tool to accelerate the step of screening titles and abstracts. For many tasks—including but not limited to systematic reviews and meta-analyses—the scientific literature needs to be checked systematically. Scholars and practitioners currently screen thousands of studies by hand to determine which to include in their review or meta-analysis. This is error-prone and inefficient because the data are extremely imbalanced: only a fraction of the screened studies is relevant. The future of systematic reviewing will be an interaction between human reviewers and machine learning algorithms that can cope with the enormous increase in available text. We therefore developed an open source, machine learning-aided pipeline that applies active learning: ASReview. We demonstrate by means of simulation studies that active learning can yield far more efficient reviewing than manual screening while maintaining high quality. Furthermore, we describe the options of the free and open source research software and present the results of user experience tests. We invite the community to contribute to open source projects such as ours that provide measurable and reproducible improvements over current practice. Screening the literature and determining, in a transparent way, what should be included in a systematic review is a challenging task for any research field. A new open source machine learning framework called ASReview, which employs active learning and offers a range of machine learning models, can check the literature efficiently and systematically.
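The screening pipeline described above can be sketched as a simple active-learning loop: retrain a classifier on the records labeled so far, then ask the human reviewer to label the unlabeled record the model considers most likely relevant. This is an illustrative sketch only — the toy abstracts, the TF-IDF plus naive Bayes model, and the certainty-based sampling rule are assumptions for demonstration, not ASReview's actual code.

```python
# Minimal active-learning screening loop (illustrative sketch, not ASReview's code).
# After each labeling round, the model is retrained and the unlabeled abstract
# most likely to be relevant is queried next (certainty-based sampling).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

abstracts = [
    "deep learning for covid-19 detection in chest ct",    # relevant
    "machine learning screening of clinical literature",   # relevant
    "a history of 18th century naval trade routes",        # irrelevant
    "covid-19 vaccine efficacy meta-analysis",             # relevant
    "gardening tips for dry climates",                     # irrelevant
    "systematic review automation with active learning",   # relevant
]
true_labels = np.array([1, 1, 0, 1, 0, 1])  # stands in for the human reviewer

X = TfidfVectorizer().fit_transform(abstracts)
labeled = {0: 1, 2: 0}        # seed set: one relevant, one irrelevant record
screened_order = list(labeled)

while len(labeled) < len(abstracts):
    idx = list(labeled)
    model = MultinomialNB().fit(X[idx], [labeled[i] for i in idx])
    unlabeled = [i for i in range(len(abstracts)) if i not in labeled]
    # Certainty sampling: query the record most likely to be relevant.
    probs = model.predict_proba(X[unlabeled])[:, 1]
    query = unlabeled[int(np.argmax(probs))]
    labeled[query] = int(true_labels[query])  # the human supplies this label
    screened_order.append(query)

print(screened_order)  # order in which records were screened
```

Because the model keeps surfacing likely-relevant records first, most relevant studies tend to be found early, which is what makes it possible to stop screening long before the full corpus is exhausted.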
The current flood of information in all areas of machine learning research, from computer vision to reinforcement learning, has made it difficult to make aggregate scientific inferences. It can be challenging to distill a myriad of similar papers into a set of useful principles, to determine which new methodologies to use for a particular application, and to be confident that one has compared against all relevant related work when developing new ideas. However, such a rapidly growing body of research literature is a problem that other fields, in particular medicine and epidemiology, have already faced. In those fields, systematic reviews and meta-analyses have been used to deal with exactly these issues, and it is not uncommon for entire journals to be dedicated to such analyses. Here, we suggest that the field of machine learning might similarly benefit from meta-analysis and systematic review, and we encourage further discussion and development along this direction.
Meta-analyses were supposed to end scientific debates. Often, they only cause more controversy. Meta-analyses have failed to settle the question of whether violent video games such as Grand Theft Auto cause aggression. After Nikolas Cruz killed 17 students and teachers and wounded 17 others early this year at Marjory Stoneman Douglas High School in Parkland, Florida, President Donald Trump had a theory about the underlying causes. "I'm hearing more and more people say the level of violence on video games is really shaping young people's thoughts," he tweeted. He wasn't the only one to make the connection.
Literature reviews can be time-consuming and tedious to complete. By cataloging and refactoring three state-of-the-art active learning techniques from evidence-based medicine and legal electronic discovery, this paper finds and implements FASTREAD, a faster technique for studying a large corpus of documents. This paper assesses FASTREAD using datasets generated from existing SE literature reviews (Hall, Wahono, Radjenović, Kitchenham et al.). Compared to manual methods, FASTREAD lets researchers find 95% of relevant studies after reviewing an order of magnitude fewer papers. Compared to other state-of-the-art automatic methods, FASTREAD reviews 20-50% fewer studies while finding the same number of relevant primary studies in a systematic literature review.
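The "95% of relevant studies after reviewing an order of magnitude fewer papers" claim rests on a standard evaluation: screen papers in the order a model ranks them and count how many must be read before a target recall is reached. Below is a hedged sketch of that metric on synthetic data — the function name, the toy corpus of 1,000 papers, and the two rankings are assumptions for illustration, not the paper's datasets or implementation.

```python
# Illustrative sketch of a recall-at-effort evaluation for screening tools:
# how many papers must be screened, following a ranking, to recover a target
# fraction (here 95%) of the truly relevant studies.
def papers_to_recall(ranking, relevant, target=0.95):
    """Count papers screened in `ranking` order until `target` recall is hit."""
    needed = target * len(relevant)
    found = 0
    for n, paper in enumerate(ranking, start=1):
        if paper in relevant:
            found += 1
        if found >= needed:
            return n
    return len(ranking)

relevant = set(range(10))               # 10 relevant papers among 1,000
good_ranking = list(range(1000))        # model ranks all relevant papers first
bad_ranking = list(range(999, -1, -1))  # worst case: relevant papers last

print(papers_to_recall(good_ranking, relevant))  # 10   papers screened
print(papers_to_recall(bad_ranking, relevant))   # 1000 papers screened
```

The gap between the two numbers is the "order of magnitude" saving: a good ranking lets the reviewer stop after a handful of papers, while an uninformative one forces screening of the entire corpus.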