In my previous post, I suggested that it was possible to provide a road map for introducing artificial intelligence, advanced analytics and machine learning into insurance companies. This post outlines the process. The first area to address is applications where the adoption of analytics will have an immediate impact on cost reduction and efficiency. The obvious starting point is process automation: in many insurance companies, the first projects involving advanced analytics and machine learning models are the digitisation and optimisation of processes.
A few months back, we gave our take on a survey from the O'Reilly folks regarding interest in deep learning. The survey reported that interest was more than latent, but there's little question that the bulk of the action today is in the (relatively) better understood confines of machine learning (ML). So on this go-round, O'Reilly jumped into the shallower side of the pond, surveying the people who subscribe to its publications and attend its big data-related Strata and AI conferences about ML. Before diving in, let's put some perspective on this cohort: it's likely a group that is, on average, ahead of the curve by virtue of its attendance at these big data events or its consumption of O'Reilly learning services, which are skewing increasingly toward the AI domain. Nonetheless, it provides a useful counterpoint to O'Reilly's earlier work exploring interest in deep learning.
This post was co-written with Joseph Rocca. Suppose that you are working in a given company and you are asked to create a model that, based on various measurements at your disposal, predicts whether a product is defective. You decide to use your favourite classifier, train it on the data and voilà: you get 96.2% accuracy! Your boss is astonished and decides to use your model without any further tests. A few weeks later he enters your office and tells you that your model is useless.
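The trap hinted at here is class imbalance: when one class dominates, a model can score high accuracy while learning nothing useful. A minimal sketch below, with hypothetical numbers chosen to match the 96.2% figure (a dataset where only 3.8% of products are defective), shows that a "model" which never flags a defect still reaches 96.2% accuracy while catching zero defective products.

```python
# 1 = defective, 0 = fine; an illustrative dataset with a 3.8% defect rate
labels = [1] * 38 + [0] * 962

# A degenerate "classifier" that always predicts "not defective"
predictions = [0] * len(labels)

# Accuracy looks impressive...
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# ...but recall on the defective class reveals the model is useless
recall_defective = (
    sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
    / sum(y == 1 for y in labels)
)

print(f"accuracy: {accuracy:.1%}")               # 96.2%
print(f"defect recall: {recall_defective:.1%}")  # 0.0%
```

This is why metrics such as precision, recall, or the F1 score — computed per class — are a better health check than raw accuracy on imbalanced data.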
You sit down to watch a movie and ask Netflix for help. (Zoolander 2?) The Netflix recommendation algorithm predicts what movie you'd like by mining data on millions of previous movie-watchers using sophisticated machine learning tools. And then the next day you go to work, where every one of your agencies makes hiring decisions with little idea of which candidates would be good workers; community college students are largely left to their own devices to decide which courses are too hard or too easy for them; and your social service system takes a reactive rather than preventive approach to homelessness because staff don't believe it's possible to forecast which families will wind up on the streets. You'd love to move your city's use of predictive analytics into the 21st century, or at least into the 20th century. You just hired a pair of 24-year-old computer programmers to run your data science team. But should they be the ones to decide which problems are amenable to these tools? Or to decide what success looks like?
This book is an introductory overview of Ethem's detailed text on ML. That text itself has gotten mostly mixed or bad reviews due to a lot of math and algorithms notated without much detailed explanation. This, however, is a general-reader intro: it doesn't go into the math, algorithms, trees, Bayesian logic, or even pseudocode in detail; it is more an up-to-date overview of the field as it exists at this writing. Alpaydin's expensive text, btw, is also available in a very inexpensive Asian edition here on Amazon if you want to brave that difficult book without a lot of investment (Introduction To Machine Learning, 3rd Edition). The present volume is sort of an "ML for Dummies," only updated for the current craze with big data management. There is a lot of history and background that an experienced ML person will find too basic, but as a high-school intro or a general-interest reader's intro it is excellent.
Business leaders have traditionally had a somewhat complicated relationship with technology. Many of them instinctively know that its deployment could be transformative for their business, yet they lack the deep knowledge required to fully understand how and when to invest. Recent research from Fujitsu suggests that levels of uncertainty around the way businesses should plan for imminent technologically-driven change are so high that business leaders around the world favour a co-ordinated, global approach led by intergovernmental bodies and governments. While I do not expect these levels of uncertainty and doubt to disappear, I do believe that when it comes to the use of AI, data analytics and data science, 2019 will be the year when we see a sharp increase in their use by organisations of all sizes. Central to the rise of data analytics are open source tools, which I believe are doing more to democratise the field of data science than anything else.
In 2008, Daniel Hulme started Satalia, a company that uses data science, machine learning, and optimization (making the best use of resources) to build customized platforms that solve tough logistics problems involving products, services, and people. Lately, Hulme has spent a good portion of his time explaining the ins and outs of artificial intelligence to other CEOs. He sees a big information gap at the top of most companies -- yet this is where technology investment decisions are made. Misunderstanding AI, Hulme believes, can mean both overestimating its value and underestimating its impact. Satalia's work is a leading example of what AI is currently good at. Not coincidentally, it is also the commercialization of Hulme's research at University College London (UCL), where he is the director of the business analytics master's degree program. Satalia's clients are household names in the U.K.; they include Tesco, DFS, and the British Broadcasting Corporation. The increasingly competitive market for AI expertise is both a blessing and a curse for Satalia.
There is an important distinction in data mining: first, mining the data to find patterns and build models; second, using the results of that mining. The results of data mining also feed back into the mining process itself. The Cross-Industry Standard Process for Data Mining, known as CRISP-DM, is an open standard process model that describes common approaches used by data mining experts. It is the most widely used analytics model and breaks the process of data mining into six major phases.
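The six CRISP-DM phases are Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. A minimal sketch of the cycle follows; the phase names are standard, but the `next_phase` helper and the wrap-around loop are illustrative only, reflecting CRISP-DM's iterative nature (deployment tends to surface new business questions).

```python
# The six CRISP-DM phases, in their conventional order
CRISP_DM_PHASES = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]

def next_phase(current: str) -> str:
    """Return the phase that conventionally follows `current`.

    The cycle wraps around: in practice, deployment feeds new
    business questions, restarting the process.
    """
    i = CRISP_DM_PHASES.index(current)
    return CRISP_DM_PHASES[(i + 1) % len(CRISP_DM_PHASES)]

print(next_phase("Evaluation"))  # Deployment
print(next_phase("Deployment"))  # Business Understanding
```

In real projects the flow is rarely strictly linear — evaluation often sends you back to data preparation or modeling — which is exactly the feedback loop the text describes.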
Despite recent advances and press regarding the field, quantum computing is still veiled in mystery and myth, even within the field of data science and technology. Even those within the field of quantum computing and quantum machine learning are still learning the potential for progress and the stark limitations of current systems. However, quantum computing has arrived in its infancy, and many major companies are pouring money into related R&D efforts. D-Wave's system has been commercially available for a couple of years already (albeit at a price tag of $10 million), and other systems have been opened for research purposes and commercial partnerships with quantum machine learning companies. Quantum computing hardware theoretically can take on several different forms, each of which is suited to a different type of machine learning problem.
One of my favorite places to learn data science is an under-the-radar educational website, DataCamp. DataCamp doesn't get nearly the attention of some of the larger, better-funded online coding schools, but I often find myself on one of its tutorials whenever I'm learning something new related to statistics or machine learning. Over the past few months, I've dedicated at least a few hours a week to learning the underpinnings of automation and, where I find something interesting, blogging about my experience. Unlike almost every other school or tutorial I've encountered, DataCamp has a delightfully distinct and powerful approach to education: every single piece of instruction is paired with a simple example and an interactive tutorial. There are no long lectures; there are no complicated diagrams.