If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
Artificial intelligence these days is sold as if it were a magic trick. Data is fed into a neural net – a black box – as a stream of jumbled numbers, and voilà! It comes out the other side completely transformed, like a rabbit pulled from a hat. That's possible in a lab, or even on a personal dev machine, with carefully cleaned and tuned data. However, it takes a lot, an awful lot, of effort to scale machine-learning algorithms up to something resembling a multiuser service – something useful, in other words.
With growing interest in neural networks and deep learning, individuals and companies are claiming ever-increasing adoption rates of artificial intelligence into their daily workflows and product offerings. Coupled with breakneck speeds in AI research, the new wave of popularity shows a lot of promise for solving some of the harder problems out there. That said, I feel that this field suffers from a gulf between appreciating these developments and subsequently deploying them to solve "real-world" tasks. A number of frameworks, tutorials and guides have popped up to democratize machine learning, but the steps that they prescribe often don't align with the fuzzier problems that need to be solved. This post is a collection of questions (with some (maybe even incorrect) answers) that are worth thinking about when applying machine learning in production.
Spending some time planning your infrastructure, standardizing setup, and defining workflows early on can save valuable time with each additional model that you build. Even after building, training and deploying your models to production, the task is still not complete until you have monitoring systems in place. Periodically saving production statistics (data samples, predicted results, outlier specifics) has proven invaluable for analytics (and error postmortems) over deployments.
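The periodic saving of production statistics described above can be sketched as a lightweight prediction logger. This is a minimal illustration, not any particular system's implementation; the sample rate, outlier threshold, and file format are all assumptions chosen for the example.

```python
import json
import random
import time

# Illustrative sketch: sample a fraction of production predictions, plus all
# outliers, into a JSON-lines log for later analytics and error postmortems.
SAMPLE_RATE = 0.01          # log roughly 1% of ordinary traffic (assumed value)
OUTLIER_THRESHOLD = 0.99    # always log extreme scores (assumed value)

def log_prediction(features, score, log_file="predictions.jsonl"):
    """Append a sampled prediction record to a JSON-lines log."""
    is_outlier = score > OUTLIER_THRESHOLD or score < 1 - OUTLIER_THRESHOLD
    if is_outlier or random.random() < SAMPLE_RATE:
        record = {
            "ts": time.time(),        # when the prediction was served
            "features": features,     # the data sample
            "score": score,           # the predicted result
            "outlier": is_outlier,    # outlier specifics
        }
        with open(log_file, "a") as f:
            f.write(json.dumps(record) + "\n")
```

A JSON-lines file is used here only because it is easy to replay during a postmortem; in practice the same records could go to any log pipeline or warehouse.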
Please read on for my discussion of some of the limitations of the technique, how we solve the problem for impact coding (also called "effects codes"), and a worked example in R. We define a nested model as any model where the results of a sub-model are used as inputs for a later model. I now think such a theorem would actually have a fairly unsatisfying statement, as one possible "bad real-world data" situation violates the usual "no re-use" requirements of differential privacy: duplicated or related columns or variables break the Laplace noising technique. But library code needs to work in the limit (as you don't know ahead of time what users will throw at it), and there are a lot of mechanisms that produce duplicate, near-duplicate, and related columns in data sources used for data science (one of the differences between data science and classical statistics is that data science tends to apply machine learning techniques to very under-curated data sets). On our artificial "each column five times" data set, the Laplace noising technique's test performance is significantly degraded (performance on held-out test data usually being a better simulation of future model performance than performance on the training set).
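Why duplicated columns break the Laplace noising technique can be seen with a small simulation. This is a simplified sketch of the general mechanism, not the article's own experiment: if the same statistic is released five times with independent Laplace noise (one release per duplicated column), averaging the copies shrinks the effective noise, violating the "no re-use" assumption.

```python
import numpy as np

# Simplified illustration (assumed parameters, not the article's code):
# the Laplace mechanism adds Laplace(0, b) noise to each released statistic.
# With a column duplicated n_copies times, each copy is noised independently,
# and averaging the copies recovers much of the un-noised value.
rng = np.random.default_rng(0)
true_value = 10.0   # the statistic being protected
b = 2.0             # Laplace noise scale
n_copies = 5        # "each column five times"
n_trials = 10_000

# One honest noised release per trial.
single = true_value + rng.laplace(0.0, b, size=n_trials)
# Five independently noised copies per trial, then averaged.
averaged = true_value + rng.laplace(0.0, b, size=(n_trials, n_copies)).mean(axis=1)

print(f"noise std, single release:   {single.std():.3f}")
print(f"noise std, averaged copies:  {averaged.std():.3f}")
```

The averaged releases carry noticeably less noise (roughly a factor of the square root of the number of copies), which is exactly the re-use leak that degrades the technique's guarantees on under-curated data.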
Often the complexity of a machine learning algorithm is in the model training, not in making predictions. I also strongly recommend gathering outlier and interesting cases from operations over time that produce unexpected results (or break the system). Like a ratchet, consider incrementally tightening performance requirements as model performance improves. If you're interested in more information on operationalizing machine learning models, check out this post, which focuses on Google-scale machine learning model deployment.
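The ratchet idea above can be made concrete as a small check that persists the best metric seen so far and refuses to let new models fall below it. This is a hypothetical sketch; the file format and function names are invented for illustration.

```python
import json
import os

# Hypothetical "ratchet" sketch: store the best observed accuracy and require
# each new model to meet it, so the bar only moves up as models improve.
RATCHET_FILE = "accuracy_ratchet.json"   # assumed location

def check_and_update_ratchet(new_accuracy, path=RATCHET_FILE):
    """Fail if accuracy regresses below the best seen; otherwise raise the bar."""
    best = 0.0
    if os.path.exists(path):
        with open(path) as f:
            best = json.load(f)["best_accuracy"]
    if new_accuracy < best:
        raise ValueError(f"accuracy {new_accuracy:.3f} below ratchet {best:.3f}")
    with open(path, "w") as f:
        json.dump({"best_accuracy": max(best, new_accuracy)}, f)
    return max(best, new_accuracy)
```

Run as a pre-deployment gate, this turns "model performance improves" into an enforced one-way requirement rather than a hope.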
In this special guest feature, Victor Amin, Data Scientist at SendGrid, advises that businesses implementing machine learning systems focus on data quality first and worry about algorithms later in order to ensure accuracy and reliability in production. At SendGrid, Victor builds machine learning models to predict engagement and detect abuse in a mailstream that handles over a billion emails per day. The training set (the data your machine learning system learns from) is the most important part of any machine learning system. Rather than labeling training data with the model's own predictions, build a system that samples production data, and have a mechanism for reliably labeling your sampled production data that isn't your machine learning model.
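One common way to sample production data, as the advice above suggests, is reservoir sampling: it keeps a fixed-size uniform sample of a stream without storing the whole stream. This is a generic sketch of that technique, not SendGrid's system; the class and method names are illustrative.

```python
import random

# Illustrative sketch: maintain a fixed-size uniform random sample of
# production traffic, to be handed to an independent labeling process
# (one that is not the machine learning model itself).
class ReservoirSampler:
    def __init__(self, capacity, seed=None):
        self.capacity = capacity   # fixed sample size
        self.seen = 0              # records observed so far
        self.sample = []           # current uniform sample
        self._rng = random.Random(seed)

    def offer(self, item):
        """Consider one production record for inclusion in the sample."""
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(item)
        else:
            # Replace an existing entry with probability capacity / seen,
            # which keeps the sample uniform over everything seen so far.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = item
```

Because the sample is uniform regardless of stream length, the labeled set stays representative of production traffic rather than of whatever the model finds easy.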
DSSTNE (Deep Scalable Sparse Tensor Network Engine, pronounced "Destiny") is an open source software library for training and deploying deep neural networks using GPUs. Amazon engineers built DSSTNE to solve deep learning problems at Amazon's scale. DSSTNE is built for production deployment of real-world deep learning applications, emphasizing speed and scale over experimental flexibility.