Berkeley Lab researchers (from left) Vahe Tshitoyan, Anubhav Jain, Leigh Weston, and John Dagdelen used machine learning to analyze 3.3 million abstracts from materials science papers. Researchers at the U.S. Department of Energy's Lawrence Berkeley National Laboratory have shown that an algorithm with no training in materials science can scan the text of millions of papers and uncover new scientific knowledge. A team led by Anubhav Jain, a scientist in Berkeley Lab's Energy Storage & Distributed Resources Division, collected 3.3 million abstracts of published materials science papers and fed them into an algorithm called Word2vec. By analyzing relationships between words, the algorithm was able to predict discoveries of new thermoelectric materials years in advance and suggest as-yet unknown materials as thermoelectric candidates. "Without telling it anything about materials science, it learned concepts like the periodic table and the crystal structure of metals," says Jain. "That hinted at the potential of the technique. But probably the most interesting thing we figured out is, you can use this algorithm to address gaps in materials research, things that people should study but haven't studied so far."
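To give a rough sense of how relationships between words can be turned into something measurable, here is a minimal sketch. It is not Word2vec and not the Berkeley Lab team's code: it swaps in a much simpler count-based approach, representing each word by the words that appear near it and comparing words by cosine similarity. The four sentences, the window size, and the function names are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy sentences invented for illustration; they are NOT from the study's corpus.
CORPUS = [
    "thermoelectric material bi2te3 has high zt",
    "thermoelectric material pbte has high zt",
    "ionic salt nacl has strong bonds",
    "ionic salt kcl has strong bonds",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    # Return a plain dict so that looking up an unknown word raises KeyError.
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(word, vectors, topn=3):
    """Rank every other word by how similar its context vector is to `word`'s."""
    target = vectors[word]
    scores = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


VECTORS = cooccurrence_vectors(CORPUS)

if __name__ == "__main__":
    # Words used in similar contexts end up with similar vectors.
    for other, score in most_similar("bi2te3", VECTORS):
        print(f"{other}: {score:.2f}")
```

In this toy setup, two compounds that are described in the same terms end up close together even though nothing in the code knows any chemistry. Word2vec learns dense vectors with a small neural network rather than raw counts, but it relies on the same intuition that words appearing in similar contexts tend to be related.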
Demand for more powerful big data analytics solutions has spurred the development of novel programming models, abstractions, and platforms for next-generation systems. For these problems, a complete solution would address data wrangling and processing, and it would support analytics over data of any modality or scale. It would support a wide array of machine learning algorithms, but also provide primitives for building new ones. It would be customizable, scale to vast volumes of data, and map to modern multicore, GPU, coprocessor, and compute cluster hardware. In pursuit of these goals, novel techniques and solutions are being developed by machine learning researchers [4,6,7], in the database and distributed systems research communities [2,5,8], and by major players in industry [1,3].
More than a decade ago, Ichiro Takeuchi, professor of materials science and engineering, started applying the subfield of artificial intelligence (AI) known as machine learning (ML) to help develop new magnetic materials. At the time, ML was not widely used in materials science. "Now, it's all the rage," says Takeuchi, who also holds an appointment with the Maryland Energy Innovation Institute. Its current popularity is due in part to the deep learning revolution of 2012 and to related advances in computer chip speed and data storage options, along with rapid refinement of the science behind its predictive algorithms. ML-based discovery in materials science is not just a lab exercise.
A great way to understand the future priorities of a company is to see where it invests resources. When you look at where Toyota, the Japanese industry giant, has recently invested, it's clear the company is preparing to remain relevant and competitive in the 4th industrial revolution through its investments and innovation in artificial intelligence, big data and robots. With initial funding of $100 million, Toyota AI Ventures invests in tech start-ups and entrepreneurs around the world that are committed to autonomous mobility, data and robotics. Toyota's investments help accelerate getting critical new technologies to market. One of the organization's investments is in May Mobility, a company that is developing self-driving shuttles for college campuses and other areas such as central business districts where low-speed applications are warranted.
What will be the next thing to revolutionize data science in 2019? Reinforcement learning will be the next big thing in data science in 2019. While RL has been around for a long time in academia, it has hardly seen any industry adoption at all. Why? Partly because there has been plenty of low-hanging fruit to pick in predictive analytics, but mostly because of the barriers in implementation, knowledge and available tools. The potential value in using RL in proactive analytics and AI is enormous, but it also demands a greater skillset to master.
Companies in all industries must stay up to date with the latest tech to survive in this digital world. This is especially true in the case of machine learning (ML), which has the potential to transform the way businesses process and use their data. While ML has a number of useful applications in the business world, applying it to business intelligence (BI) insights can help you optimize your processes and make even better decisions. Thirteen members of Forbes Technology Council shared some creative ways to combine business intelligence with machine learning to produce the best results for your company. One of the more distinctive ways to combine business intelligence and machine learning is the identification of fraud indicators.
One of the formidable challenges healthcare providers face is putting medical data to maximum use. Somewhere between the quest to unlock the mysteries of medicine and the effort to design better treatments, therapies, and procedures lies the real world of applying data and protecting patient privacy. "Today, there are many barriers to putting data to work in the most effective way possible," observes Drew Harris, director of health policy and population health at Thomas Jefferson University's College of Population Health in Philadelphia, PA. "The goals of protecting patients and finding answers are frequently at odds." It is a critical issue and one that will define the future of medicine. Medical advances are increasingly dependent on the analysis of enormous datasets--as well as data that extends beyond any one agency or enterprise.
Two hundred students, industry professionals, and academic leaders convened at the Microsoft NERD Center in Cambridge, Massachusetts for the second annual Women in Data Science (WiDS) conference on March 5. The conference grew from 150 participants last year, and highlighted local strength in academics and health care. "The WiDS conference highlighted female leadership in data science in the Boston area," said Caroline Uhler, a member of the WiDS steering committee who is an IDSS core faculty member and assistant professor of electrical engineering and computer science (EECS) at MIT. "This event is particularly important to encourage more female scientists in related areas to join this emerging area that has such broad societal impact." Regina Barzilay, Delta Electronics Professor of EECS, gave the first presentation on how data science and machine learning approaches are improving cancer research. Barzilay said her experiences as a breast cancer survivor motivate her work.
Knowing how to write high-quality software -- the days of one team writing throwaway models and another team implementing them in production are slowly coming to an end. With programming languages like Python and R and their packages making it easy to work with data and models, it is reasonable to expect a data scientist or machine learning engineer to attain a high level of programming proficiency and understand the basics of system design. While "big data" is a term used way too often, it is true that the cost of data storage is on a dramatic downward trend. This means that there are more and more data sets from different domains to work with and apply models to. And yes, knowing something about at least one of the popular areas of the field that have gained traction lately -- deep learning for computer vision and perception, recommendation engines, NLP -- would be a great thing once you have the fundamental understanding and technical proficiency.
There has never been a better time to be a politician. But it's an even better time to be a machine learning engineer working for a politician. Throughout modern history, political candidates have had only a limited number of tools to take the temperature of the electorate. More often than not, they've had to rely on instinct rather than insight when running for office. Now big data can be used to maximise the effectiveness of a campaign.