If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
When I was the Vice President of Advertiser Analytics at Yahoo, this became a key focus guiding the analytics that we delivered to advertisers to help them optimize their spend across the Yahoo ad network. Advertisers had significant advertising and marketing spend that we were not tapping because we could not deliver the audience, content and campaign insights that would help them spend that money with us. And the "Money on the Table" (MOTT) was huge. Now here I am again, and I'm again noticing a massive MOTT economic opportunity across all companies: orphaned analytics. Orphaned analytics are one-off analytics developed to address a specific use case but never "operationalized" or packaged for re-use across other organizational use cases.
How are some of the world's largest data analytics providers utilising machine learning to enhance their offerings? Recent research has shown that companies that use analytics for decision making are 6% more profitable than those that don't. Harnessing analytics within business operations can benefit companies in a number of ways, including the capacity to be proactive and anticipate needs, mitigate risks, increase product quality and personalisation, and optimise the customer experience. As a result of these benefits, the technology industry has seen giants such as Microsoft, Amazon and IBM ramp up their investments in Big Data, with the sector expected to reach over US$273mn in value by 2023. What is machine learning, and how can it be applied to data analytics?
In the last few months, two "predictive" documents found their way into our hands. The first is the 2016 NMC/CoSN Horizon report for elementary and secondary education, and the second is the SURF Trend report 2016: How technological trends enable customised education. Both are very interesting and well-written reports. However, they're also a bit tricky in that they're not really underpinned by concrete evidence from the educational sciences, and therefore their predictions are, in our opinion, a bit like reading tea leaves: they're very visible, but what do they mean? As a preamble to discussing the SURF Trend report 2016, here is an aside to frame some background.
When Michelangelo started, the most urgent and highest impact use cases were some very high scale problems, which led us to build around Apache Spark (for large-scale data processing and model training) and Java (for low latency, high throughput online serving). This structure worked well for production training and deployment of many models but left a lot to be desired in terms of overhead, flexibility, and ease of use, especially during early prototyping and experimentation [where Notebooks and Python shine]. Uber expanded Michelangelo "to serve any kind of Python model from any source to support other Machine Learning and Deep Learning frameworks like PyTorch and TensorFlow [instead of just using Spark for everything]." So why did Uber (and many other tech companies) build its own platform and framework-independent machine learning infrastructure? The posts How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka and Using Apache Kafka to Drive Cutting-Edge Machine Learning describe the benefits of leveraging the Apache Kafka ecosystem as a central, scalable, and mission-critical nervous system. It allows real-time data ingestion, processing, model deployment, and monitoring in a reliable and scalable way. This post focuses on how the Kafka ecosystem can help solve the impedance mismatch between data scientists, data engineers, and production engineers. By leveraging it to build your own scalable machine learning infrastructure and also make your data scientists happy, you can solve the same problems for which Uber built its own ML platform, Michelangelo.
Both approaches (a model server called via RPC and a model embedded directly in the streaming application) have their pros and cons. The blog post Machine Learning and Real-Time Analytics in Apache Kafka Applications and the Kafka Summit presentation Event-Driven Model Serving: Stream Processing vs. RPC with Kafka and TensorFlow discuss this in detail. There are more and more applications where the analytic model is directly embedded into the event streaming application, making it robust, decoupled, and optimized for performance and latency. The model can be loaded into the application when starting it up (e.g., using the TensorFlow Java API). Model management (including versioning) depends on your build pipeline and DevOps strategy. For example, new models can be embedded into a new Kubernetes pod, which simply replaces the old pod. Another commonly used option is to send newly trained models (or just the updated weights or hyperparameters) as a Kafka message to a Kafka topic.
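The last option above, shipping a new model as a message, can be sketched in a few lines. This is a minimal illustration, not a real Kafka application: a plain Python iterable of (topic, payload) pairs stands in for the consumer, the topic names "events" and "model-updates" are hypothetical, and LinearModel is a toy stand-in for a trained model.

```python
import json
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class LinearModel:
    """Toy stand-in for a trained model: score = weights . features + bias."""
    version: int
    weights: List[float]
    bias: float

    def predict(self, features: List[float]) -> float:
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias


def run_stream(records: Iterable[Tuple[str, bytes]], model: LinearModel) -> Iterator[Dict]:
    """Score events with the embedded model and swap it when an update arrives.

    Each record is a (topic, payload) pair; the topic names are assumptions
    made for this sketch, not names used by any real deployment.
    """
    for topic, payload in records:
        message = json.loads(payload)
        if topic == "model-updates":
            # A newly trained model (here only its weights and bias) arrives
            # as an ordinary message and replaces the one held in memory.
            model = LinearModel(message["version"], message["weights"], message["bias"])
        elif topic == "events":
            # Inference happens in-process, with no remote call.
            yield {
                "id": message["id"],
                "score": model.predict(message["features"]),
                "model_version": model.version,
            }
```

Because the update travels through the same ordered stream as the events, every prediction can be tagged with the model version that produced it, which is one reason this pattern is attractive for auditing.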
The relationship between Apache Kafka and machine learning (ML) is an interesting one that I've written about quite a bit in How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka and Using Apache Kafka to Drive Cutting-Edge Machine Learning. This blog post addresses a specific part of building a machine learning infrastructure: the deployment of an analytic model in a Kafka application for real-time predictions. Model training and model deployment can be two separate processes. However, you can also use many of the same steps for integration and data preprocessing because you often need to perform the same integration, filter, enrichment, and aggregation of data for model training and model inference. We will discuss and compare two different options for model deployment: model servers with remote procedure calls (RPCs), and natively embedding models into Kafka client applications.
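To make the two points above concrete, here is a minimal sketch in which one preprocessing function is reused for training and inference, and the two deployment options differ only in where the model is called. Everything in it is an assumption for illustration: the record fields "amount" and "items", the toy threshold "model", and the call_server callable that stands in for a real RPC client.

```python
from typing import Callable, Dict, List, Optional

Features = List[float]


def preprocess(record: Dict[str, float]) -> Optional[Features]:
    """Filter and enrich one raw record; shared by training and inference."""
    if record.get("amount", -1.0) < 0 or record.get("items", 0) <= 0:
        return None  # filter: drop malformed records
    amount, items = record["amount"], record["items"]
    return [amount, amount / items]  # enrichment: add average price per item


def train(records: List[Dict[str, float]]) -> Callable[[Features], bool]:
    """Toy 'training': flag amounts above the mean of the training data."""
    rows = [f for f in map(preprocess, records) if f is not None]
    threshold = sum(r[0] for r in rows) / len(rows)
    return lambda features: features[0] > threshold


class EmbeddedScorer:
    """Model embedded natively in the client application (in-process call)."""

    def __init__(self, model: Callable[[Features], bool]) -> None:
        self.model = model

    def score(self, record: Dict[str, float]) -> Optional[bool]:
        features = preprocess(record)
        return None if features is None else self.model(features)


class RpcScorer:
    """Model hosted in a model server; call_server stands in for the RPC."""

    def __init__(self, call_server: Callable[[Features], bool]) -> None:
        self.call_server = call_server

    def score(self, record: Dict[str, float]) -> Optional[bool]:
        features = preprocess(record)
        return None if features is None else self.call_server(features)
```

Both scorers expose the same score method, so the rest of the application does not need to know which deployment option is in use; what changes is latency, coupling, and how the model is versioned.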
Imagine an advanced fighter aircraft is patrolling a hostile conflict area and a bogey suddenly appears on radar, accelerating aggressively toward it. The pilot, with the assistance of an Artificial Intelligence co-pilot, has a fraction of a second to decide what action to take: ignore, avoid, flee, bluff, or attack. The costs associated with a false positive or a false negative are substantial: a wrong decision could provoke a war or lead to the death of the pilot. What is one to do…and why? No less than the Defense Advanced Research Projects Agency (DARPA) and the Department of Defense (DoD) are interested not only in applying AI to decide what to do in hostile, unstable and rapidly devolving environments but also in understanding why an AI model recommended a particular action.
Deep learning has achieved astonishing results on many tasks with large amounts of data and generalization within the proximity of training data. For many important real-world applications, these requirements are infeasible, and additional prior knowledge of the task domain is required to overcome the resulting problems. In particular, learning physics models for model-based control requires robust extrapolation from fewer samples, often collected online in real time, and model errors may lead to severe damage to the system. Directly incorporating physical insight has enabled us to obtain a novel deep model learning approach that extrapolates well while requiring fewer samples. As a first example, we propose Deep Lagrangian Networks (DeLaN), a deep network structure upon which Lagrangian mechanics has been imposed. DeLaN can efficiently learn the equations of motion of a mechanical system (i.e., the system dynamics) with a deep network while ensuring physical plausibility. The resulting DeLaN network performs very well at robot tracking control. The proposed method not only outperforms previous model learning approaches in learning speed but also exhibits substantially improved and more robust extrapolation to novel trajectories and learns online in real time.
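For readers unfamiliar with what "imposing Lagrangian mechanics" can mean, the following is a sketch of the standard rigid-body derivation that such a network structure builds on. The symbols are our notation and assumptions, not taken from the abstract above: q are generalized coordinates, H(q) is the mass matrix, V(q) the potential energy, and tau the generalized forces.

```latex
% Lagrangian: kinetic minus potential energy
\mathcal{L}(q, \dot q) = \tfrac{1}{2}\, \dot q^{\top} H(q)\, \dot q - V(q)

% Euler-Lagrange equation with generalized forces \tau
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot q}
  - \frac{\partial \mathcal{L}}{\partial q} = \tau

% Substituting the Lagrangian gives the equations of motion
H(q)\, \ddot q + \dot H(q)\, \dot q
  - \tfrac{1}{2} \left( \frac{\partial}{\partial q}
      \big( \dot q^{\top} H(q)\, \dot q \big) \right)^{\!\top}
  + \frac{\partial V}{\partial q} = \tau

% One way to keep H(q) positive definite (assumed here):
% the network outputs a lower-triangular L_\theta(q) with positive diagonal
H(q) = L_\theta(q)\, L_\theta(q)^{\top}
```

Under these assumptions, a network only has to represent the mass matrix factor and the potential (or its gradient); the structure of the equations of motion is fixed by the physics, which is one plausible reason such a model can extrapolate from fewer samples.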
We expect the landscape to be an integrated edge-to-core-to-cloud solution enabling what today is called IoT, Big Data, Fast Data and AI. Each time a promising new technology emerges, we seem to go through a period where it is proposed to be the solution to everything, until we reconcile how that technology fits into the bigger picture. Such is the case with artificial intelligence (AI). Clearly, the advancements in deep learning will create new classes of solutions, but AI is not a standalone solution, and we are just now beginning to see how it fits into our IT landscape. AI emerges at a time when several other shifts in analytics technology are occurring.
For many years, AI/ML has been used to establish the identity of perpetrators, their whereabouts at the time of a criminal act, and their actions and whereabouts prior to and following it. By hand, these are arduous tasks, but AI categorization that sifts through massive amounts of visual data, combined with ML behavior scripts, can eliminate human errors, especially in witness identification, and thereby increase arrest accuracy. "Predictive policing" is the practice of identifying the dates, times and locations where specific crimes are most likely to occur, then scheduling officers to patrol those areas in hopes of preventing crimes from taking place and thereby keeping neighborhoods safer. After much research and input from major police departments in cooperation with software suppliers, predictive analytic models have been continuously refined. A profile matrix can be constructed from a database containing known associates, possible DNA found at the scene, gunshot detection, etc.