A machine learning model trained to spot the research that would go on to have the highest impact has predicted 19 out of 20 seminal breakthroughs in biotechnology from the past 40 years.
The Dynamic Early-warning by Learning to Predict High Impact (Delphi) system analysed almost 1.7 million papers published in biotechnology since 1980 to learn what was associated with the papers with the biggest impact after five years.
Underlying the model was a complex “heterogeneous graph” database that enabled it to assess the links between the papers across 29 different “features” such as authorship, journal of publication, citations and research networks.
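To make the idea of a heterogeneous graph concrete, the following is a minimal sketch, not the Delphi implementation. It assumes a toy graph in which papers, authors and journals are different node types joined by typed links, and it derives three simple per-paper features. The class `HeteroGraph`, the function `paper_features`, the edge names and the feature names are all hypothetical and chosen only for illustration; the article does not list the 29 features Delphi actually uses.

```python
from collections import defaultdict


class HeteroGraph:
    """Toy heterogeneous graph: nodes and edges both carry a type label."""

    def __init__(self):
        self.node_type = {}            # node id -> "paper" / "author" / "journal"
        self.edges = defaultdict(set)  # (source, edge type) -> set of targets

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, source, edge_type, target):
        self.edges[(source, edge_type)].add(target)

    def neighbours(self, source, edge_type):
        return self.edges.get((source, edge_type), set())


def paper_features(graph, paper):
    """Derive a few illustrative features (hypothetical, not Delphi's) for one paper."""
    authors = graph.neighbours(paper, "written_by")
    citing_papers = graph.neighbours(paper, "cited_by")
    citing_authors = set()
    for citing in citing_papers:
        citing_authors |= graph.neighbours(citing, "written_by")
    return {
        "n_authors": len(authors),
        "n_citations": len(citing_papers),
        # Citing authors who are not authors of the paper itself.
        "n_outside_citing_authors": len(citing_authors - authors),
    }


# Made-up example: P1 is cited by P2 and P3.
g = HeteroGraph()
for paper in ("P1", "P2", "P3"):
    g.add_node(paper, "paper")
for author in ("A1", "A2", "A3", "A4"):
    g.add_node(author, "author")
g.add_node("J1", "journal")

g.add_edge("P1", "published_in", "J1")
for paper, author in [("P1", "A1"), ("P1", "A2"), ("P2", "A2"),
                      ("P2", "A3"), ("P3", "A4")]:
    g.add_edge(paper, "written_by", author)
for citing in ("P2", "P3"):
    g.add_edge("P1", "cited_by", citing)

features = paper_features(g, "P1")
print(features)
# {'n_authors': 2, 'n_citations': 2, 'n_outside_citing_authors': 2}
```

Keeping node and edge types explicit is what lets a single structure answer questions about authorship, venues and citations at once, which a plain citation count cannot do.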
Researchers based at the Massachusetts Institute of Technology who built the model then showed that it could correctly identify almost all “seminal biotechnologies” produced over the period in a “blinded” test.
They also used the system to flag 50 papers from 2018 predicted to be in the top 5 per cent in the future.
According to an article giving an initial demonstration of Delphi, published in Nature Biotechnology, the model was also better than more widely used metric-based assessments at spotting “hidden gem” papers that may have had low citation counts in the initial period after publication.
“When we apply the Delphi approach to a set of papers in biotechnology, we are able to substantially outperform previous citation and handcrafted systems for impact prediction,” the paper says.
The authors add that such a system had the potential to aid funders and research evaluators in making better decisions and avoiding the kind of biases and gaming that occurred with simpler metric assessments.
For example, they say, using citation counts two years after publication to find the highest-impact papers would produce around 40 per cent “false positives”, a rate that Delphi cuts in half.
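The arithmetic behind that comparison can be sketched as follows. This is an illustration under stated assumptions: it takes “false positives” to mean the share of flagged papers that do not turn out to be high impact, which may differ from the paper's exact definition, and the paper identifiers are made up to reproduce the roughly 40 per cent and 20 per cent figures quoted above. The function name `false_positive_share` is hypothetical.

```python
def false_positive_share(flagged, truly_high_impact):
    """Share of flagged papers that are not in the truly high-impact set.

    Assumption: this is one plausible reading of "false positives";
    the source article does not spell out the definition.
    """
    flagged = set(flagged)
    if not flagged:
        return 0.0
    wrong = flagged - set(truly_high_impact)
    return len(wrong) / len(flagged)


# Made-up identifiers: p1..p10 are the papers that really became high impact.
truly_high_impact = {f"p{i}" for i in range(1, 11)}

# A citation-count filter that flags 10 papers, 4 of them wrongly.
citation_flagged = {"p1", "p2", "p3", "p4", "p5", "p6", "x1", "x2", "x3", "x4"}

# A model that flags 10 papers, 2 of them wrongly.
model_flagged = {"p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "x1", "x2"}

citation_rate = false_positive_share(citation_flagged, truly_high_impact)
model_rate = false_positive_share(model_flagged, truly_high_impact)
print(f"citation filter: {citation_rate:.0%}, model: {model_rate:.0%}")
```

On this toy data, halving the rate means that of every ten papers flagged, two rather than four would be funded or promoted in error.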
“As with all machine learning-based systems, care must be taken to ensure that these methods reduce (and do not unintentionally aggravate) latent systemic biases and also do not provide opportunities for malicious actors to manipulate the system for their own gain,” they add.
But, the paper says, “by considering a broad range of features and using only those that hold real signal about future impact, we think that Delphi holds the potential to reduce bias by obviating reliance on simpler (and often reputation-related) metrics”.
“By computationally digesting, at scale, the vast amount of information contained in the scientific enterprise, we might be able to allocate our limited resources in a more efficient, fair and productive manner and thus increase the return on the resources that are collectively deployed into science and technology,” the authors conclude.
James Weis, a research affiliate of the MIT Media Lab, said a model like Delphi could be “naturally extensible to other research fields” by extending the database underpinning the system and adapting to different proxies for research quality.
“Current approaches to measuring research quality – unintentionally but inevitably – have biases that prevent resources from flowing to the most deserving projects or people,” he said.
“We are attempting to use data science to address these systemic inefficiencies – by building tools that help us fund the ‘hidden gems’ that could otherwise be missed or underfunded.”