 Computational Learning Theory


Tutorial Slides by Andrew Moore, computer scientist at Google, ex-CMU professor

@machinelearnbot

The Decision Tree is one of the most popular classification algorithms in current use in Data Mining and Machine Learning. This tutorial can be used as a self-contained introduction to the flavor and terminology of data mining without needing to review many statistical or probabilistic pre-requisites. If you're new to data mining you'll enjoy it, but your eyebrows will raise at how simple it all is! After having defined the job of classification, we explain how information gain (next Andrew Tutorial) can be used to find predictive input attributes. We show how applying this procedure recursively allows us to build a decision tree to predict future events.
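
As a rough illustration of how information gain can rank input attributes, the sketch below computes the entropy of a target column and the reduction in entropy obtained by splitting on each attribute. The toy records and the names `entropy`, `information_gain`, `rows`, `outlook`, `windy`, and `play` are assumptions made for this example; they are not taken from the tutorial slides.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its expected entropy after splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        # Collect the target labels that fall under each value of the attribute.
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy data set: "outlook" determines "play", "windy" does not.
rows = [
    {"outlook": "sunny", "windy": False, "play": "no"},
    {"outlook": "sunny", "windy": True, "play": "no"},
    {"outlook": "rainy", "windy": False, "play": "yes"},
    {"outlook": "rainy", "windy": True, "play": "yes"},
]

# The attribute with the highest gain would be chosen as the root split.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, a, "play"))
```

Building a full tree would repeat this selection on each subset of rows produced by the chosen split, stopping when a subset contains a single class or no attributes remain.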


What is the difference between Artificial Intelligence and Machine Learning?

#artificialintelligence

Machine learning is the subfield of computer science that, according to Arthur Samuel in 1959, gives "computers the ability to learn without being explicitly programmed." Evolved


AI Weekly: $102 million and machine learning's theory of general relativity

#artificialintelligence

That was the question last week as Element.ai raised this hefty sum in a Series A funding. Investors included Microsoft, Nvidia, and Intel Capital, all of whom have their own AI ambitions. Element aims to make AI easy for businesses to use by connecting them with machine learning experts. And while Element may have brought together competitors Microsoft and Intel, the rivals have been fiercely staking their own claims with a flurry of investments and acquisitions. While these moves point to AI one day becoming ubiquitous for business tasks, Google's release of an academic paper called "One Model to Learn Them All" shows another route for machine learning to become commonplace.


Boost SAT Solver with Hybrid Branching Heuristic

AAAI Conferences

Most state-of-the-art satisfiability (SAT) solvers are capable of solving large application instances with efficient branching heuristics. The VSIDS heuristic is widely used because of its robustness. This paper focuses on the inherent ties in VSIDS and proposes a new branching heuristic called TB-VSIDS, which attempts to break the ties with the consideration of the interplay between the branching heuristic and learned clauses. However, a branching heuristic cannot cover all problems, and its performance improves when combined with an appropriate configuration. Therefore, we also propose a hybrid model of branching heuristics based on random forest. The efficiencies of TB-VSIDS and hybrid branching heuristics are evaluated on benchmarks in SAT Competitions. By constructing a model that reduces the overfitting problem, we hope to realize a hybrid branching heuristic that is widely applicable to other solvers.


Machine Learning the Future Class « Machine Learning (Theory)

#artificialintelligence

This spring, I taught a class on Machine Learning the Future at Cornell Tech covering a number of advanced topics in machine learning including online learning, joint (structured) prediction, active learning, contextual bandit learning, logarithmic time prediction, and parallel learning. Each of these classes was recorded from the laptop via Zoom and I just uploaded the recordings to YouTube. In some ways, this class is a follow-up to the large scale learning class I taught with Yann LeCun 4 years ago. The videos for that class were taken down(*) so these lectures both update and replace shared subjects as well as having some new subjects. Much of this material is fairly close to research so to assist other machine learning lecturers around the world in digesting the material, I've made all the source available as well.


The Mathematics of Machine Learning – Towards Data Science – Medium

#artificialintelligence

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I have observed that some actually lack the necessary mathematical intuition and framework to get useful results. This is the main reason I decided to write this blog post. Recently, there has been an upsurge in the availability of many easy-to-use machine and deep learning packages such as scikit-learn, Weka, TensorFlow, R-caret, etc. Machine Learning theory is a field that intersects statistical, probabilistic, computer science and algorithmic aspects arising from learning iteratively from data and finding hidden insights which can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and getting good results. What Level of Maths Do You Need?


Soft Weight-Sharing for Neural Network Compression

arXiv.org Machine Learning

The success of deep learning in numerous application domains created the desire to run and train them on mobile devices. This, however, conflicts with their computationally, memory and energy intensive nature, leading to a growing interest in compression. Recent work by Han et al. (2015a) proposes a pipeline that involves retraining, pruning and quantization of neural network weights, obtaining state-of-the-art compression rates. In this paper, we show that competitive compression rates can be achieved by using a version of soft weight-sharing (Nowlan & Hinton, 1992). Our method achieves both quantization and pruning in one simple (re-)training procedure. This point of view also exposes the relation between compression and the minimum description length (MDL) principle.
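
To give a feel for the general idea behind soft weight-sharing, the sketch below places a mixture-of-Gaussians prior over a list of weights, computes the resulting penalty (the negative log-likelihood of the weights under the mixture), and then snaps each weight to the mean of its most responsible component. This is only a simplified illustration under stated assumptions: the mixture components are fixed by hand rather than learned during retraining, and the names `gaussian_pdf`, `mixture_penalty`, `quantize`, and `components` are hypothetical. It is not the procedure from the paper.

```python
from math import exp, log, pi, sqrt


def gaussian_pdf(w, mean, std):
    """Density of a univariate Gaussian at w."""
    return exp(-0.5 * ((w - mean) / std) ** 2) / (std * sqrt(2 * pi))


def mixture_penalty(weights, components):
    """Negative log-likelihood of the weights under a Gaussian mixture.

    components is a list of (mixing_proportion, mean, std) tuples.
    """
    return -sum(
        log(sum(p * gaussian_pdf(w, m, s) for p, m, s in components))
        for w in weights
    )


def quantize(weights, components):
    """Snap each weight to the mean of the component most responsible for it."""
    snapped = []
    for w in weights:
        _, mean, _ = max(components, key=lambda c: c[0] * gaussian_pdf(w, c[1], c[2]))
        snapped.append(mean)
    return snapped


# Assumed, hand-picked mixture: a heavy component at zero (weights snapped
# here can be pruned) plus two non-zero shared values.
components = [(0.5, 0.0, 0.1), (0.25, -1.0, 0.1), (0.25, 1.0, 0.1)]

clustered = mixture_penalty([0.0, 1.0, -1.0], components)
scattered = mixture_penalty([0.5, 0.5, -0.5], components)
shared = quantize([0.03, 0.9, -1.1, -0.02], components)
```

Weights that sit near a component mean incur a lower penalty than weights that sit between components, which is what pushes weights to cluster when such a term is added to the training loss. After snapping, only the component index per weight and the few shared values need to be stored.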


A General Characterization of the Statistical Query Complexity

arXiv.org Machine Learning

Statistical query (SQ) algorithms are algorithms that have access to an SQ oracle for the input distribution $D$ instead of i.i.d. samples from $D$. Given a query function $\phi: X \rightarrow [-1,1]$, the oracle returns an estimate of $\mathbf{E}_{x \sim D}[\phi(x)]$ within some tolerance $\tau_\phi$ that roughly corresponds to the number of samples. In this work we demonstrate that the complexity of solving general problems over distributions using SQ algorithms can be captured by a relatively simple notion of statistical dimension that we introduce. SQ algorithms capture a broad spectrum of algorithmic approaches used in theory and practice, most notably, convex optimization techniques. Hence our statistical dimension allows us to investigate the power of a variety of algorithmic approaches by analyzing a single linear-algebraic parameter. Such characterizations were investigated over the past 20 years in learning theory, but prior characterizations are restricted to the much simpler setting of classification problems relative to a fixed distribution on the domain (Blum et al., 1994; Bshouty and Feldman, 2002; Yang, 2001; Simon, 2007; Feldman, 2012; Szorenyi, 2009). Our characterization is also the first to precisely characterize the necessary tolerance of queries. We give applications of our techniques to two open problems in learning theory and to algorithms that are subject to memory and communication constraints.
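
The oracle interface described above can be simulated in a few lines. The sketch below is a minimal illustration under assumptions that are not in the abstract: the distribution $D$ is taken to be the empirical distribution of a finite list, a single tolerance `tau` is used for every query, and the oracle perturbs the true expectation with uniform random noise, although an SQ oracle is allowed to return any value within the tolerance. The names `make_sq_oracle`, `sample`, and `tau` are hypothetical.

```python
import random


def make_sq_oracle(sample, tau, rng=None):
    """Return a simulated SQ oracle for the empirical distribution of sample.

    The returned function takes a query phi mapping points into [-1, 1] and
    answers with the mean of phi over sample, perturbed by at most tau.
    """
    rng = rng or random.Random(0)

    def oracle(phi):
        values = [phi(x) for x in sample]
        if any(v < -1 or v > 1 for v in values):
            raise ValueError("query function must map into [-1, 1]")
        exact = sum(values) / len(values)
        # Any answer within tau of the true expectation is a valid response;
        # uniform noise is just one way to pick such an answer.
        return exact + rng.uniform(-tau, tau)

    return oracle


# Hypothetical usage: estimate E[phi(x)] for a +/-1 indicator query.
sample = [0, 1, 1, 1]
oracle = make_sq_oracle(sample, tau=0.1)
answer = oracle(lambda x: 1.0 if x == 1 else -1.0)
```

An algorithm written against `oracle` never sees individual samples, only noisy expectations, which is the restriction that makes SQ lower bounds possible.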


On Fundamental Limits of Robust Learning

arXiv.org Machine Learning

We consider the problems of robust PAC learning from distributed and streaming data, which may contain malicious errors and outliers, and analyze their fundamental complexity questions. In particular, we establish lower bounds on the communication complexity for distributed robust learning performed on multiple machines, and on the space complexity for robust learning from streaming data on a single machine. These results demonstrate that gaining robustness of learning algorithms is usually at the expense of increased complexities. As far as we know, this work gives the first complexity results for distributed and online robust PAC learning.


Outline of machine learning - Wikipedia

#artificialintelligence

Machine learning – subfield of computer science[1] (more particularly soft computing) that evolved from the study of pattern recognition and computational learning theory in artificial intelligence.[1] In 1959, Arthur Samuel defined machine learning as a "Field of study that gives computers the ability to learn without being explicitly programmed".[2] Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.[3] Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.