Building a simple but powerful recommendation system is much easier than you think. This guide explains innovations that make machine learning practical for business production settings and demonstrates how even a small-scale development team can design an effective large-scale recommender. In this guide, Practical Machine Learning: Innovations in Recommendation, authors and Mahout committers Ted Dunning and Ellen Friedman shed light on a more approachable recommendation engine design and the business advantages of leveraging this innovative implementation style.
Machine learning (ML) is essentially a rules-based automation engine. An ML system is trained via rule definitions to look for certain conditions and perform a pre-defined set of actions. Artificial Intelligence is built on an ML engine, but is focused on making comparisons, evaluating relevance, and projecting potential outcomes as a means of initiating the best available set of actions (not just those scenarios for which it was programmed). This is important because AI is capable of interpreting spoken language (sounds) and images (both still and video), and of inferring data relationships that may not be explicitly defined. By contrast, ML just executes defined rules.
We are pleased to launch PennAI – an accessible artificial intelligence system and open-source software developed at the University of Pennsylvania's Perelman School of Medicine by faculty, staff, and students from the Penn Institute for Biomedical Informatics (IBI). The components of PennAI include a human engine (i.e., the user); a user-friendly interface for interacting with the AI; a machine learning engine for data mining; a controller engine for launching jobs and keeping track of analytical results; a graph database for storing data and results (i.e., the memory); an AI engine for monitoring results and automatically launching or recommending new analyses; and a visualization engine for displaying results and analytical knowledge. This AI system provides a comprehensive set of integrated components for automated machine learning (AutoML), thus providing a data science assistant for generating useful results from large and complex data problems. More details can be found in our PennAI publications.
Many articles have been written about the top machine learning algorithms. Most of them seem to define top as oldest, and thus most used, ignoring modern, efficient algorithms fit for big data, such as indexation, attribution modeling, collaborative filtering, or recommendation engines used by companies such as Amazon, Google, or Facebook. This morning I received an advertisement for a (self-published) book called Master Machine Learning Algorithms, and I could not resist posting the author's list of top 10 machine learning algorithms. Some of these techniques, such as Naive Bayes (variables are almost never uncorrelated), Linear Discriminant Analysis (clusters are almost never separated by hyperplanes), or Linear Regression (numerous model assumptions - including linearity - are almost always violated in real data), have been so abused that I would hesitate to teach them. This is not a criticism of the book; most textbooks mention pretty much the same algorithms, and this one, like them, even skips all graph-related algorithms. Even k Nearest Neighbors has modern, fast implementations not covered in traditional books - we are indeed working on this topic and expect to have an article published shortly about it.
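To make the mention of collaborative filtering concrete, here is a minimal sketch of one common variant, item-based collaborative filtering with cosine similarity. It is only an illustration under stated assumptions: the toy ratings, the user and item names, and the helper functions `item_vectors`, `cosine`, and `recommend` are all hypothetical and do not come from the book or articles discussed above, and a production recommender would look quite different.

```python
from math import sqrt

# Hypothetical toy data (an assumption for illustration): user -> {item: rating}.
ratings = {
    "ann": {"book": 5, "film": 4, "game": 1},
    "bob": {"book": 4, "film": 5},
    "cat": {"game": 5, "song": 4},
    "dan": {"book": 1, "game": 4, "song": 5},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    items = {}
    for user, prefs in ratings.items():
        for item, score in prefs.items():
            items.setdefault(item, {})[user] = score
    return items


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=2):
    """Rank unseen items by a similarity-weighted average of the user's own ratings."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate, vec in items.items():
        if candidate in seen:
            continue  # only recommend items the user has not rated yet
        sims = [(cosine(vec, items[i]), r) for i, r in seen.items()]
        weight = sum(s for s, _ in sims)
        if weight > 0:
            scores[candidate] = sum(s * r for s, r in sims) / weight
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("bob", ratings))
```

The design choice here is to compare items rather than users: two items count as similar when the same users rate them in similar ways, and a user's predicted interest in an unseen item is the average of their existing ratings, weighted by those similarities.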