Performance Analysis
Support Vector Algorithms for Optimizing the Partial Area Under the ROC Curve
Narasimhan, Harikrishna, Agarwal, Shivani
The area under the ROC curve (AUC) is a widely used performance measure in machine learning. Increasingly, however, in several applications, ranging from ranking to biometric screening to medicine, performance is measured not in terms of the full area under the ROC curve, but in terms of the partial area under the ROC curve between two false positive rates. In this paper, we develop support vector algorithms for directly optimizing the partial AUC between any two false positive rates. Our methods are based on minimizing a suitable proxy or surrogate objective for the partial AUC error. In the case of the full AUC, one can readily construct and optimize convex surrogates by expressing the performance measure as a summation of pairwise terms. The partial AUC, on the other hand, does not admit such a simple decomposable structure, making it more challenging to design and optimize (tight) convex surrogates for this measure. Our approach builds on the structural SVM framework of Joachims (2005) to design convex surrogates for partial AUC, and solves the resulting optimization problem using a cutting plane solver. Unlike the full AUC, where the combinatorial optimization needed in each iteration of the cutting plane solver can be decomposed and solved efficiently, the corresponding problem for the partial AUC is harder to decompose. One of our main contributions is a polynomial time algorithm for solving the combinatorial optimization problem associated with partial AUC. We also develop an approach for optimizing a tighter non-convex hinge loss based surrogate for the partial AUC using difference-of-convex programming. Our experiments on a variety of real-world and benchmark tasks confirm the efficacy of the proposed methods.
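To make the quantity being optimized concrete, here is a minimal sketch of how an empirical partial AUC between two false positive rates could be evaluated. It is not the paper's algorithm. It only measures the partial area for a given set of scores. The function name `partial_auc`, its arguments, and the pessimistic treatment of ties are assumptions made for illustration.

```python
def partial_auc(pos_scores, neg_scores, alpha, beta):
    """Normalized area under the empirical ROC curve for FPR in [alpha, beta].

    Illustrative sketch: the ROC curve is treated as a step function in which
    the j-th highest-scoring negative covers the FPR interval [j/n, (j+1)/n].
    A positive tied with a negative is counted as ranked below it.
    """
    if not 0.0 <= alpha < beta <= 1.0:
        raise ValueError("need 0 <= alpha < beta <= 1")
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative")

    n = len(neg_scores)
    m = len(pos_scores)
    negs = sorted(neg_scores, reverse=True)

    area = 0.0
    for j, neg in enumerate(negs):
        # Overlap of this negative's FPR interval with [alpha, beta].
        width = min((j + 1) / n, beta) - max(j / n, alpha)
        if width <= 0:
            continue
        # True positive rate on this interval: positives scored above this negative.
        tpr = sum(p > neg for p in pos_scores) / m
        area += width * tpr

    # Divide by the width of the FPR range so a perfect ranker scores 1.0.
    return area / (beta - alpha)


if __name__ == "__main__":
    pos = [0.9, 0.8, 0.4]
    neg = [0.7, 0.5, 0.3, 0.1]
    print(partial_auc(pos, neg, 0.0, 0.5))
    print(partial_auc(pos, neg, 0.0, 1.0))
```

With `alpha = 0` and `beta = 1` the function reduces to the full AUC, which is the sum of pairwise terms mentioned in the abstract. For a narrower range, only the negatives whose rank falls inside the range contribute, which is why the measure does not decompose over all pairs.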
How do neurons operate on sparse distributed representations? A mathematical theory of sparsity, neurons and active dendrites
We propose a formal mathematical model for sparse representations and active dendrites in neocortex. Our model is inspired by recent experimental findings on active dendritic processing and NMDA spikes in pyramidal neurons. These experimental and modeling studies suggest that the basic unit of pattern memory in the neocortex is instantiated by small clusters of synapses operated on by localized non-linear dendritic processes. We derive a number of scaling laws that characterize the accuracy of such dendrites in detecting activation patterns in a neuronal population under adverse conditions. We introduce the union property, which shows that synapses for multiple patterns can be randomly mixed together within a segment and still lead to highly accurate recognition. We describe simulation results that provide further insight into sparse representations as well as two primary results. First, we show that pattern recognition by a neuron with active dendrites can be extremely accurate and robust with high-dimensional sparse inputs even when using a tiny number of synapses to recognize large patterns. Second, equations representing recognition accuracy of a dendrite predict optimal NMDA spiking thresholds under a generous set of assumptions. The prediction tightly matches NMDA spiking thresholds measured in the literature. Our model matches many of the known properties of pyramidal neurons. As such, the theory provides a mathematical framework for understanding the benefits and limits of sparse representations in cortical networks.
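The abstract does not reproduce the paper's equations, so the following is only a rough sketch of the kind of calculation involved. It assumes, for illustration, that a dendritic segment holds `s` synapses onto a population of `n` cells, that an unrelated input pattern has exactly `a` active cells chosen uniformly at random, and that the segment fires when at least `theta` of its synapses see an active cell. Under those assumptions the false match probability is a hypergeometric tail. The function name and parameter names are hypothetical.

```python
from math import comb


def false_match_probability(n, a, s, theta):
    """P(overlap >= theta) between s fixed synapses and a random sparse pattern.

    Assumes (for illustration only) that the pattern has exactly `a` of `n`
    cells active, chosen uniformly at random, so the overlap with the `s`
    synapses follows a hypergeometric distribution.
    """
    total = comb(n, a)  # number of possible patterns with a active cells
    matching = sum(
        comb(s, b) * comb(n - s, a - b)  # patterns overlapping in exactly b synapses
        for b in range(theta, min(s, a) + 1)
    )
    return matching / total


if __name__ == "__main__":
    # Hypothetical sizes: a large, sparse population and a small segment.
    print(false_match_probability(n=2048, a=40, s=20, theta=10))
```

Under these assumptions, the probability of an accidental match falls off steeply as `n` grows with `a` held small, which is consistent with the abstract's claim that a tiny number of synapses can recognize large patterns when inputs are high-dimensional and sparse.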
Manufacturing Downtime Cost Reduction with Predictive Maintenance - Arimo
Manufacturers often have to deal with up to 800 hours of downtime annually. On average, an automotive manufacturer's TDC is $22,000 per minute; that is about $1.3M per hour! With the advance of predictive analytics, TDC can easily be reduced; however, only 14% of the manufacturing industry is taking advantage of its big data, according to a recent survey from MESA. Predictive maintenance is realized through the application of sophisticated machine learning techniques to equipment condition data collected in real-time or near real-time. It is now the new standard for reducing cost, risk and lost production in manufacturing facilities.
Innovated scalable efficient estimation in ultra-large Gaussian graphical models
Large-scale precision matrix estimation is of fundamental importance yet challenging in many contemporary applications for recovering Gaussian graphical models. In this paper, we suggest a new approach of innovated scalable efficient estimation (ISEE) for estimating large precision matrices. Motivated by the innovated transformation, we convert the original problem into that of large covariance matrix estimation. The suggested method combines the strengths of recent advances in high-dimensional sparse modeling and large covariance matrix estimation. Compared to existing approaches, our method is scalable and can deal with much larger precision matrices with simple tuning. Under mild regularity conditions, we establish that this procedure can recover the underlying graphical structure with significant probability and provide efficient estimation of link strengths. Both computational and theoretical advantages of the procedure are evidenced through simulation and real data examples.
The AI system that can detect 85% of cyber attacks, with a little human help
MIT scientists have built a hybrid human/artificial intelligence (AI) machine that they claim can learn how to detect 85% of cyber attacks – that's roughly three times better than previous benchmarks – while reducing false positive rates by a factor of 5. Nitesh Chawla, professor of computer science at Notre Dame University, said in a statement from MIT that the machine "has the potential to become a line of defense against attacks such as fraud, service abuse and account takeover." Researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and the machine-learning startup PatternEx demonstrated the platform, called AI2, in a paper titled "AI2: Training a big data machine to defend". As the researchers describe the current state of the art, today's security systems are typically driven by either humans – so-called "analyst-driven solutions" – or by machine. The problem with security systems based on fixed rules is that they miss attacks that don't match those rules. Machine-learning approaches, as the name suggests, rely on an adaptive process that can trigger annoying numbers of false positives.
A Selection of Giant Radio Sources from NVSS
Results of the application of pattern recognition techniques to the problem of identifying Giant Radio Sources (GRS) from the data in the NVSS catalog are presented and issues affecting the process are explored. Decision-tree pattern recognition software was applied to training set source pairs developed from known NVSS large angular size radio galaxies. The full training set consisted of 51,195 source pairs, 48 of which were known GRS for which each lobe was primarily represented by a single catalog component. The source pairs had a maximum separation of 20 arc minutes and a minimum component area of 1.87 square arc minutes at the 1.4 mJy level. The importance of comparing resulting probability distributions of the training and application sets for cases of unknown class ratio is demonstrated. The probability of correctly ranking a randomly selected (GRS, non-GRS) pair from the best of the tested classifiers was determined to be 97.8 +/- 1.5%. The best classifiers were applied to the over 870,000 candidate pairs from the entire catalog. Images of higher ranked sources were visually screened and a table of over sixteen hundred candidates, including morphological annotation, is presented. These systems include doubles and triples, Wide-Angle Tail (WAT) and Narrow-Angle Tail (NAT), S- or Z-shaped systems, and core-jets and resolved cores. While some resolved lobe systems are recovered with this technique, generally it is expected that such systems would require a different approach.
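The "probability of correctly ranking a randomly selected (GRS, non-GRS) pair" quoted above is the quantity usually reported as the full AUC. As a minimal sketch, assuming each candidate pair receives a numeric classifier score, it could be estimated by counting correctly ordered pairs. The function name and the half-credit rule for ties are illustrative choices, not details taken from the paper.

```python
def pairwise_ranking_probability(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties receive half credit, a common convention assumed here.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0   # positive correctly ranked above negative
            elif p == q:
                wins += 0.5   # tie: half credit
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Toy scores standing in for classifier outputs on known GRS and non-GRS pairs.
    grs = [0.9, 0.8, 0.4]
    non_grs = [0.7, 0.5, 0.3, 0.1]
    print(pairwise_ranking_probability(grs, non_grs))
```

The double loop is quadratic in the number of examples. For a training set of the size described above, with 48 known GRS, that is still only a few million comparisons.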
Introduction to Machine Learning with scikit-learn - Machine Learning Mastery
The scikit-learn library is one of the most popular platforms for everyday machine learning and data science. The reason is that it is built upon Python, a fully featured programming language. But how do you get started with machine learning with scikit-learn? Kevin Markham is a data science trainer who created a series of 9 videos that show you exactly how to get started in machine learning with scikit-learn. In this post you will discover this series of videos and exactly what is covered, step-by-step, to help you decide if the material will be useful to you.
Reelin' and ROCin', Receiver Operating Characteristic by David Lettier
Imagine standing by a murky stream. You notice objects floating past you. Pulling out your notebook, you write down for each object how confident you are that it is a fish (between 0.0 and 1.0). Not all are fish; some pieces are trash. Downstream, your friend scoops up each object and writes down what it actually is.