Text Mining of Scientific Literature Can Lead to New Discoveries
Berkeley Lab researchers (from left) Vahe Tshitoyan, Anubhav Jain, Leigh Weston, and John Dagdelen used machine learning to analyze 3.3 million abstracts from materials science papers.
Researchers at the U.S. Department of Energy's Lawrence Berkeley National Laboratory have shown that an algorithm with no training in materials science can scan the text of millions of papers and uncover new scientific knowledge. A team led by Anubhav Jain, a scientist in Berkeley Lab's Energy Storage & Distributed Resources Division, collected 3.3 million abstracts of published materials science papers and fed them into an algorithm called Word2vec. By analyzing relationships between words, the algorithm was able to predict discoveries of new thermoelectric materials years in advance and to suggest as-yet-unknown materials as thermoelectric candidates. "Without telling it anything about materials science, it learned concepts like the periodic table and the crystal structure of metals," says Jain. "That hinted at the potential of the technique. But probably the most interesting thing we figured out is, you can use this algorithm to address gaps in materials research, things that people should study but haven't studied so far."
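To make the idea of "analyzing relationships between words" concrete: Word2vec represents each word as a numeric vector, and one common way to compare two such vectors is cosine similarity. The sketch below ranks a few candidate terms by their similarity to a query term. It is an illustration only. The summary above does not say how the team scored candidates, the function names are hypothetical, and the three-dimensional vectors and `material_*` names are made up rather than taken from a trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(embeddings, query, candidates):
    """Return (name, score) pairs for the candidates, most similar to the query first."""
    query_vec = embeddings[query]
    scored = [(name, cosine_similarity(embeddings[name], query_vec))
              for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors for illustration; real Word2vec embeddings are learned
# from text and typically have hundreds of dimensions.
toy_embeddings = {
    "thermoelectric": [0.9, 0.1, 0.2],
    "material_a": [0.8, 0.2, 0.1],
    "material_b": [0.1, 0.9, 0.3],
    "material_c": [0.5, 0.5, 0.5],
}

ranking = rank_by_similarity(
    toy_embeddings,
    "thermoelectric",
    ["material_a", "material_b", "material_c"],
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy vectors, `material_a` ranks first because its vector points in nearly the same direction as the query vector. In a trained model, a high score of this kind would only flag a term as worth a closer look.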
Jul-8-2019, 15:20:11 GMT
- AI-Alerts:
- 2019 > 2019-07 > AAAI AI-Alert for Jul 9, 2019 (1.00)
- Country:
- North America > United States (0.76)
- Genre:
- Research Report > New Finding (0.66)