Computational Learning Theory
An Introduction to Machine Learning Theory and Its Applications: A Visual Tutorial with Examples
ML builds heavily on statistics. For example, when we train our machine to learn, we have to give it a statistically significant random sample as training data. If the training set is not random, we run the risk of the machine picking up patterns that aren't actually there. And if the training set is too small (see the law of large numbers), we won't learn enough and may even reach inaccurate conclusions. For example, attempting to predict company-wide satisfaction patterns based on data from upper management alone would likely be error-prone.
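As a toy illustration of both points (sample size and sampling bias), the sketch below simulates satisfaction scores for a hypothetical company where upper management is assumed, purely for illustration, to be more satisfied than everyone else; the group sizes and score distributions are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: 950 regular employees, 50 upper managers.
# Managers are assumed (for illustration only) to be more satisfied on average.
employees = rng.normal(loc=6.0, scale=1.5, size=950)   # satisfaction on a 0-10 scale
managers = rng.normal(loc=8.5, scale=0.8, size=50)
population = np.concatenate([employees, managers])

print("true company-wide mean:", round(population.mean(), 2))

# Biased sample: upper management only -> systematically overestimates satisfaction.
print("management-only estimate:", round(managers.mean(), 2))

# Random samples: larger samples concentrate around the true mean (law of large numbers).
for n in (10, 100, 500):
    sample = rng.choice(population, size=n, replace=False)
    print(f"random sample of {n:3d}:", round(sample.mean(), 2))
```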
How To Become A Machine Learning Expert In One Simple Step -- Swan Intelligence
The web is full of good explanations of machine learning algorithms, and every second applicant for a data science position has finished the Coursera course on machine learning. But that theory will not help you choose good values for the 16 parameters a standard implementation of a random forest takes. The default values are good to get started, but which parameters should you modify, and how, depending on your data? Choosing the right features, algorithms and parameters is an art.
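To make that point concrete, here is a minimal sketch using scikit-learn (which the post itself does not mention) of the handful of random-forest parameters that are usually worth tuning first, with everything else left at its defaults; the parameter grid is only an illustrative guess, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data standing in for "your data".
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Of the many constructor parameters, a few tend to matter most in practice:
# number of trees, tree depth, features considered per split, and leaf size.
grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 10],
    "max_features": ["sqrt", 0.5],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Which values actually win depends entirely on the data, which is exactly why the defaults are only a starting point.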
An Introduction to Machine Learning Theory and Its Applications: A Visual Tutorial with Examples
Machine Learning (ML) is coming into its own, with a growing recognition that ML can play a key role in a wide range of critical applications, such as data mining, natural language processing, image recognition, and expert systems. ML provides potential solutions in all these domains and more, and is set to be a pillar of our future civilization. The supply of able ML designers has yet to catch up to this demand. A major reason for this is that ML is just plain tricky. This tutorial introduces the basics of Machine Learning theory, laying down the common themes and concepts, making it easy to follow the logic and get comfortable with the topic. So what exactly is "machine learning" anyway?
Keeping it Short and Simple: Summarising Complex Event Sequences with Multivariate Patterns
Bertens, Roel, Vreeken, Jilles, Siebes, Arno
We study how to obtain concise descriptions of discrete multivariate sequential data. In particular, we study how to do so in terms of rich multivariate sequential patterns that can capture potentially highly interesting (cor)relations between sequences. To this end we allow our pattern language to span over the domains (alphabets) of all sequences, allow patterns to overlap temporally, as well as allow for gaps in their occurrences. We formalise our goal by the Minimum Description Length principle, by which our objective is to discover the set of patterns that provides the most succinct description of the data. To discover high-quality pattern sets directly from data, we introduce DITTO, a highly efficient algorithm that approximates the ideal result very well. Experiments show that DITTO correctly discovers the patterns planted in synthetic data. Moreover, it scales favourably with the length of the data, the number of attributes, and the alphabet sizes. On real data, ranging from sensor networks to annotated text, DITTO discovers easily interpretable summaries that provide clear insight into both the univariate and multivariate structure.
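DITTO itself is not reproduced here, but the MDL objective can be illustrated with a deliberately crude two-part code: model cost plus data cost, where adding a pattern to the "code table" pays off only if it shortens the encoded data by more than it costs to describe. The encoding below is my own toy scheme, far simpler than the paper's, and only for univariate sequences.

```python
import math
from collections import Counter

def shannon_bits(counts):
    """Total Shannon code length (in bits) for a stream with the given symbol counts."""
    total = sum(counts.values())
    return sum(c * -math.log2(c / total) for c in counts.values())

def two_part_cost(sequence, pattern=None):
    """Crude two-part MDL cost: model bits + data bits.
    If a pattern is given, its non-overlapping occurrences are replaced by one symbol."""
    seq = list(sequence)
    model_bits = 0.0
    if pattern:
        # Charge the model for spelling out the pattern (uniform code over the alphabet).
        alphabet = set(sequence)
        model_bits = len(pattern) * math.log2(max(len(alphabet), 2))
        # Greedily replace occurrences of the pattern with a single placeholder symbol.
        out, i, p = [], 0, list(pattern)
        while i < len(seq):
            if seq[i:i + len(p)] == p:
                out.append("<P>")
                i += len(p)
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return model_bits + shannon_bits(Counter(seq))

data = "abcabcabcxyabcab"
print("cost without pattern:", round(two_part_cost(data), 1))
print("cost with 'abc' in the code table:", round(two_part_cost(data, "abc"), 1))
```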
Minimax Lower Bounds for Realizable Transductive Classification
Tolstikhin, Ilya, Lopez-Paz, David
Transductive learning considers a training set of $m$ labeled samples and a test set of $u$ unlabeled samples, with the goal of best labeling that particular test set. In contrast, inductive learning considers a training set of $m$ labeled samples drawn iid from $P(X,Y)$, with the goal of best labeling any future samples drawn iid from $P(X)$. This comparison suggests that transduction is a much easier type of inference than induction, but is this really the case? This paper provides a negative answer to this question, by proving the first known minimax lower bounds for transductive, realizable, binary classification. Our lower bounds show that $m$ should be at least $\Omega(d/\epsilon + \log(1/\delta)/\epsilon)$ when $\epsilon$-learning a concept class $\mathcal{H}$ of finite VC-dimension $d<\infty$ with confidence $1-\delta$, for all $m \leq u$. From this result we draw three important conclusions. First, general transduction is as hard as general induction, since both problems have $\Omega(d/m)$ minimax values. Second, the use of unlabeled data does not help general transduction, since supervised learning algorithms such as ERM and (Hanneke, 2015) match our transductive lower bounds while ignoring the unlabeled test set. Third, our transductive lower bounds imply lower bounds for semi-supervised learning, which add to the important discussion about the role of unlabeled data in machine learning.
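A minimal sketch of the transductive setting, using a toy 1D threshold class of my own choosing rather than anything from the paper: ERM is fit on the $m$ labeled points and then simply applied to the fixed set of $u$ unlabeled test points, ignoring them during training, which is exactly the kind of algorithm the lower bound says cannot be beaten in general.

```python
import numpy as np

rng = np.random.default_rng(0)

# Concept class: 1D thresholds h_t(x) = 1[x >= t]; the data is realizable by construction.
true_t = 0.3
m, u = 20, 50                          # labeled training size, unlabeled test size
X_train = rng.uniform(0, 1, m)
y_train = (X_train >= true_t).astype(int)
X_test = rng.uniform(0, 1, u)          # transduction: we only care about labeling these points

# ERM over a grid of candidate thresholds (it never looks at the unlabeled test set,
# mirroring the abstract's point that ERM already matches the transductive lower bound).
candidates = np.linspace(0, 1, 1001)
train_err = [(np.mean((X_train >= t).astype(int) != y_train), t) for t in candidates]
best_t = min(train_err)[1]

pred_labels = (X_test >= best_t).astype(int)
true_labels = (X_test >= true_t).astype(int)
print("transductive error on this particular test set:", np.mean(pred_labels != true_labels))
```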
The Optimal Sample Complexity of PAC Learning
This work establishes a new upper bound on the number of samples sufficient for PAC learning in the realizable case. The bound matches known lower bounds up to numerical constant factors. This solves a long-standing open problem on the sample complexity of PAC learning. The technique and analysis build on a recent breakthrough by Hans Simon.
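For intuition about what "matches known lower bounds up to numerical constant factors" means, the snippet below compares the classic realizable-case bound, which carries an extra $\log(1/\epsilon)$ factor, with the optimal $\Theta((d + \log(1/\delta))/\epsilon)$ rate. Constants are dropped, so the numbers only show the scaling, not actual sample sizes.

```python
import math

def classic_pac_bound(d, eps, delta):
    """Classic realizable-case VC bound, up to constants: (d*log(1/eps) + log(1/delta)) / eps."""
    return (d * math.log(1 / eps) + math.log(1 / delta)) / eps

def optimal_pac_bound(d, eps, delta):
    """Optimal rate matching the lower bound, up to constants: (d + log(1/delta)) / eps."""
    return (d + math.log(1 / delta)) / eps

# The gap between the two grows like log(1/eps) as the accuracy requirement tightens.
for eps in (0.1, 0.01, 0.001):
    print(eps, round(classic_pac_bound(10, eps, 0.05)), round(optimal_pac_bound(10, eps, 0.05)))
```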
On the Pseudo-Dimension of Nearly Optimal Auctions
Morgenstern, Jamie H., Roughgarden, Tim
This paper develops a general approach, rooted in statistical learning theory, to learning an approximately revenue-maximizing auction from data. We introduce t-level auctions to interpolate between simple auctions, such as welfare maximization with reserve prices, and optimal auctions, thereby balancing the competing demands of expressivity and simplicity. We prove that such auctions have small representation error, in the sense that for every product distribution F over bidders' valuations, there exists a t-level auction with small t and expected revenue close to optimal. We show that the set of t-level auctions has modest pseudo-dimension (for polynomial t) and therefore leads to small learning error. One consequence of our results is that, in arbitrary single-parameter settings, one can learn a mechanism with expected revenue arbitrarily close to optimal from a polynomial number of samples.
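The simplest member of the family discussed here, a second-price auction with a single reserve price, can already be learned from valuation samples in a few lines. The sketch below picks the empirically best reserve on simulated exponential valuations, a setting I chose for illustration where the optimal reserve is known to be 1; it is not the paper's t-level construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def revenue_with_reserve(valuations, r):
    """Revenue of a single-item second-price auction with reserve price r."""
    v = np.sort(valuations)[::-1]
    if v[0] < r:
        return 0.0                                   # no sale: highest bid below reserve
    return max(v[1], r) if len(v) > 1 else r         # winner pays max(second bid, reserve)

def estimate_best_reserve(samples, candidates):
    """Pick the reserve with the highest average revenue on the sampled valuation profiles."""
    avg_rev = [np.mean([revenue_with_reserve(s, r) for s in samples]) for r in candidates]
    return candidates[int(np.argmax(avg_rev))]

samples = [rng.exponential(1.0, size=2) for _ in range(2000)]   # 2 bidders, Exp(1) valuations
candidates = np.linspace(0, 3, 61)
print("empirically best reserve:", round(float(estimate_best_reserve(samples, candidates)), 2))
```

With enough samples the estimate concentrates near the true optimum, which is the flavor of guarantee the pseudo-dimension bound makes precise.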
Sum-of-Squares Lower Bounds for Sparse PCA
This paper establishes a statistical versus computational trade-off for solving a basic high-dimensional machine learning problem via a basic convex relaxation method. Specifically, we consider the {\em Sparse Principal Component Analysis} (Sparse PCA) problem, and the family of {\em Sum-of-Squares} (SoS, aka Lasserre/Parrilo) convex relaxations. It was well known that in large dimension $p$, a planted $k$-sparse unit vector can be {\em in principle} detected using only $n \approx k\log p$ (Gaussian or Bernoulli) samples, but all {\em efficient} (polynomial time) algorithms known require $n \approx k^2$ samples. It was also known that this quadratic gap cannot be improved by the most basic {\em semi-definite} (SDP, aka spectral) relaxation, equivalent to degree-2 SoS algorithms. Here we prove that degree-4 SoS algorithms also cannot improve this quadratic gap. This average-case lower bound adds to the small collection of hardness results in machine learning for this powerful family of convex relaxation algorithms. Moreover, our design of moments (or ``pseudo-expectations'') for this lower bound is quite different from previous lower bounds. Establishing lower bounds for higher-degree SoS algorithms remains a challenging problem.
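To make the planted model concrete, the snippet below draws samples from the spiked covariance $N(0, I + \beta vv^T)$ with a $k$-sparse unit vector $v$ and runs diagonal thresholding, the kind of simple polynomial-time detector that is known to need on the order of $k^2$ samples. It is only a toy baseline of my own and has nothing to do with the SoS machinery analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

p, k, n, beta = 500, 10, 500, 4.0      # dimension, sparsity, samples, signal strength

# Planted k-sparse spike: x_i = sqrt(beta) * g_i * v + z_i, so Cov(x) = I + beta * v v^T.
support = rng.choice(p, size=k, replace=False)
v = np.zeros(p)
v[support] = 1 / np.sqrt(k)
X = rng.standard_normal((n, p)) + np.sqrt(beta) * rng.standard_normal((n, 1)) * v

# Diagonal thresholding: pick the k coordinates with the largest empirical variance,
# i.e. the simple efficient detector, not an information-theoretically optimal one.
diag = np.var(X, axis=0)
recovered = set(np.argsort(diag)[-k:])
print("support recovery rate:", len(recovered & set(support)) / k)
```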
Spectral Norm Regularization of Orthonormal Representations for Graph Transduction
Shivanna, Rakesh, Chatterjee, Bibaswan K., Sankaran, Raman, Bhattacharyya, Chiranjib, Bach, Francis
Recent literature~\cite{ando} suggests that embedding a graph on a unit sphere leads to better generalization for graph transduction. However, the choice of optimal embedding and an efficient algorithm to compute the same remain open. In this paper, we show that orthonormal representations, a class of unit-sphere graph embeddings, are PAC learnable. Existing PAC-based analyses do not apply, as the VC dimension of the function class is infinite. We propose an alternative PAC-based bound, which does not depend on the VC dimension of the underlying function class but is related to the famous Lov\'{a}sz~$\vartheta$ function. The main contribution of the paper is SPORE, a SPectral regularized ORthonormal Embedding for graph transduction, derived from the PAC bound. SPORE is posed as a non-smooth convex function over an \emph{elliptope}. Such problems are usually solved as semi-definite programs (SDPs) with time complexity $O(n^6)$. We present Infeasible Inexact Proximal~(IIP): an inexact proximal method which performs a subgradient procedure on an approximate projection, not necessarily feasible. IIP is more scalable than SDP, has an $O(\frac{1}{\sqrt{T}})$ convergence rate, and is generally applicable whenever a suitable approximate projection is available. We use IIP to compute SPORE, where the approximate projection step is computed by FISTA, an accelerated gradient descent procedure. We show that the method has a convergence rate of $O(\frac{1}{\sqrt{T}})$. The proposed algorithm easily scales to thousands of vertices, while the standard SDP computation does not scale beyond a few hundred vertices. Furthermore, the analysis presented here easily extends to the multiple graph setting.
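The SPORE objective and the IIP solver are not reproduced here; the sketch below only illustrates the general idea of running subgradient steps with an approximate (possibly infeasible) projection, on a deliberately simple toy problem of my own choosing rather than the elliptope-constrained problem from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy problem: minimise the non-smooth convex f(x) = ||x - c||_1 over
# C = {x : ||x||_2 <= 1, sum(x) = 1}.  The projection onto C is computed only
# approximately, by a few alternating-projection rounds, so iterates need not be
# exactly feasible -- the flavor of inexact-projection schemes (NOT the paper's IIP).
n = 50
c = rng.standard_normal(n)

def subgrad(x):
    return np.sign(x - c)                        # a subgradient of ||x - c||_1 at x

def approx_project(x, rounds=3):
    for _ in range(rounds):
        x = x - (x.sum() - 1.0) / len(x)         # project onto the hyperplane sum(x) = 1
        norm = np.linalg.norm(x)
        if norm > 1.0:
            x = x / norm                         # project onto the unit Euclidean ball
    return x                                     # approximately (not exactly) in C

x = np.zeros(n)
best = np.inf
for t in range(1, 2001):
    x = approx_project(x - (0.5 / np.sqrt(t)) * subgrad(x))   # O(1/sqrt(T))-style step size
    best = min(best, np.abs(x - c).sum())
print("best objective value found:", round(best, 3))
```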