MIT researchers are hoping to advance the democratization of data science with a new tool for nonstatisticians that automatically generates models for analyzing raw data. Democratizing data science is the notion that anyone, with little to no expertise, can do data science if provided ample data and user-friendly analytics tools. Supporting that idea, the new tool ingests datasets and generates sophisticated statistical models typically used by experts to analyze, interpret, and predict underlying patterns in data. The tool currently lives on Jupyter Notebook, an open-source web framework that allows users to run programs interactively in their browsers. Users need only write a few lines of code to uncover insights into, for instance, financial trends, air travel, voting patterns, the spread of disease, and other trends.
A great way to understand a company's future priorities is to see where it invests resources. When you look at where Toyota, the Japanese industry giant, has recently invested, it's clear the company is preparing to remain relevant and competitive in the 4th industrial revolution through its investments and innovation in artificial intelligence, big data and robots. With initial funding of $100 million, Toyota AI Ventures invests in tech start-ups and entrepreneurs around the world that are committed to autonomous mobility, data and robotics. Toyota's investments help accelerate getting critical new technologies to market. One of the organization's investments is in May Mobility, a company that is developing self-driving shuttles for college campuses and other areas such as central business districts where low-speed applications are warranted.
Using a highly sophisticated form of pattern matching, researchers from Florida Atlantic University's College of Engineering and Computer Science are teaching "machines" to detect Medicare fraud. About $19 billion to $65 billion is lost every year because of Medicare fraud, waste, or abuse. Like someone searching for the proverbial "needle in a haystack," human auditors or investigators have the painstaking task of manually checking thousands of Medicare claims for specific patterns that could indicate foul play or fraudulent behaviors. Furthermore, according to the U.S. Department of Justice, fraud enforcement efforts currently rely heavily on health care professionals coming forward with information about Medicare fraud. The study, "The Effects of Varying Class Distribution on Learner Behavior for Medicare Fraud Detection With Imbalanced Big Data," published in the journal Health Information Science and Systems, uses big data from Medicare Part B and employs advanced data analytics and machine learning to automate the fraud detection process.
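To make "varying class distribution" concrete: one common way to change the ratio of rare (fraud) to common (non-fraud) examples is random undersampling of the majority class. The sketch below illustrates that general technique on synthetic labels. It is not the study's actual pipeline, which the excerpt does not describe; the function name `undersample_to_ratio` and its parameters are hypothetical.

```python
import random


def undersample_to_ratio(labels, minority_fraction, seed=0):
    """Pick row indices so the minority class (label 1) makes up roughly
    `minority_fraction` of the result.

    Illustrative only: keeps every minority row and a random subset of
    majority rows (label 0). Not taken from the study.
    """
    if not 0 < minority_fraction < 1:
        raise ValueError("minority_fraction must be strictly between 0 and 1")

    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]

    # Solve n_min / (n_min + n_maj) = minority_fraction for n_maj.
    target_majority = round(len(minority) * (1 - minority_fraction) / minority_fraction)
    # Never ask for more majority rows than exist.
    target_majority = min(target_majority, len(majority))

    kept_majority = rng.sample(majority, target_majority)
    return sorted(minority + kept_majority)


# Synthetic, made-up labels: 10 "fraud" rows among 1,000 claims.
labels = [1] * 10 + [0] * 990
balanced = undersample_to_ratio(labels, minority_fraction=0.5)
mostly_normal = undersample_to_ratio(labels, minority_fraction=0.1)
```

Training the same learner on several such subsets, each with a different `minority_fraction`, is one way to observe how class distribution affects detection performance.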
What will be the next thing to revolutionize data science in 2019? Reinforcement learning will be the next big thing in data science in 2019. While RL has been around for a long time in academia, it has seen hardly any industry adoption. Why? Partly because there has been plenty of low-hanging fruit to pick in predictive analytics, but mostly because of the barriers in implementation, knowledge and available tools. The potential value in using RL in proactive analytics and AI is enormous, but it also demands a greater skill set to master.
The MIT Statistics and Data Science Center (SDSC), a part of the Institute for Data, Systems, and Society (IDSS), announced two new academic programs today: the MicroMasters program in Statistics and Data Science, and the Interdisciplinary Doctoral Program in Statistics, both beginning in the fall. The MicroMasters program, currently under development by MIT faculty, will be offered online through edX. "Digital technologies are enabling us to bring MIT's data science curriculum to learners around the world regardless of their location or socioeconomic status," says Vice President for Open Learning Sanjay Sarma. The curriculum includes foundational knowledge of data science methods and tools, a deep dive into probability and statistics, and opportunities to learn, implement, and experiment with data analysis techniques and machine learning algorithms. "The demand for data scientists is growing rapidly," says Dean for Digital Learning Krishna Rajagopal.
Two hundred students, industry professionals, and academic leaders convened at the Microsoft NERD Center in Cambridge, Massachusetts for the second annual Women in Data Science (WiDS) conference on March 5. The conference grew from 150 participants last year, and highlighted local strength in academics and health care. "The WiDS conference highlighted female leadership in data science in the Boston area," said Caroline Uhler, a member of the WiDS steering committee who is an IDSS core faculty member and assistant professor of electrical engineering and computer science (EECS) at MIT. "This event is particularly important to encourage more female scientists in related areas to join this emerging area that has such broad societal impact." Regina Barzilay, Delta Electronics Professor of EECS, gave the first presentation on how data science and machine learning approaches are improving cancer research. Barzilay said her experiences as a breast cancer survivor motivate her work.
Knowing how to write high-quality software -- the days of one team writing throwaway models and another team implementing them in production are slowly coming to an end. With programming languages like Python and R and their packages making it easy to work with data and models, it is reasonable to expect a data scientist or machine learning engineer to attain a high level of programming proficiency and understand the basics of system design. While "big data" is a term used way too often, it is true that the cost of data storage is on a dramatic downward trend. This means that there are more and more data sets from different domains to work with and apply models to. And yes, knowing something about at least one of the popular areas of the field that have gained traction lately -- deep learning for computer vision and perception, recommendation engines, NLP -- would be a great thing once you have the fundamental understanding and technical proficiency.
There has never been a better time to be a politician. But it's an even better time to be a machine learning engineer working for a politician. Throughout modern history, political candidates have had only a limited number of tools to take the temperature of the electorate. More often than not, they've had to rely on instinct rather than insight when running for office. Now big data can be used to maximise the effectiveness of a campaign.
Numerai is a hedge fund that's using technology to create an unprecedented network effect, and transform the way money is managed. Crowdsourced investment strategies are many and varied, but Numerai crowdsources machine intelligence in a totally unique way by supplying its network of data scientists with encrypted data on which to test their machine learning models, thus removing any bias attached to the application of the algorithms. These models are entered into a monthly tournament and the best ones receive a pay-out. This was previously done using Bitcoin (because it was efficient and more anonymous than PayPal), but more recently Numerai launched its own token, Numeraire (NMR), on Ethereum, the public blockchain which has spawned a multitude of trustless, decentralized applications. The aim of the token was to create more value for Numerai's growing network of scientists, and further align them with the collaborative goals of the project.
Machine Learning is the buzzword of the moment. In recent years, news stories raving about its possibilities have soared, Google searches for the term have quadrupled, and companies across the globe have been scrambling to figure out how to capitalize on the excitement by bringing it into their product mix. While that can be a great thing, claims made by some businesses about what Machine Learning can do are wildly exaggerated. That makes it crucial to cut through the noise and get to grips with its potential, limitations, and what you can realistically achieve with your resources so that any investment makes solid business sense -- so say Philip Lima, CEO of Mashey, and Boaz Farkash, Head of Product Management at Sisense. The pair joined forces to deliver an in-depth webinar on Machine Learning and business intelligence, which you can view in full here.