AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

The Pseudo-Dimension of Near-Optimal Auctions

Neural Information Processing SystemsMar-13-2024, 05:31:23 GMT

This paper develops a general approach, rooted in statistical learning theory, to learning an approximately revenue-maximizing auction from data. We introduce t-level auctions to interpolate between simple auctions, such as welfare maximization with reserve prices, and optimal auctions, thereby balancing the competing demands of expressivity and simplicity. We prove that such auctions have small representation error, in the sense that for every product distribution F over bidders' valuations, there exists a t-level auction with small t and expected revenue close to optimal. We show that the set of t-level auctions has modest pseudodimension (for polynomial t) and therefore leads to small learning error. One consequence of our results is that, in arbitrary single-parameter settings, one can learn a mechanism with expected revenue arbitrarily close to optimal from a polynomial number of samples.

auction, bidder, t-level auction, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(8 more...)

Genre: Research Report (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.50)

Add feedback

Sum-of-Squares Lower Bounds for Sparse PCA Tengyu Ma1 Department of Computer Science, Princeton University

Neural Information Processing SystemsMar-13-2024, 04:46:10 GMT

This paper establishes a statistical versus computational trade-off for solving a basic high-dimensional machine learning problem via a basic convex relaxation method. Specifically, we consider the Sparse Principal Component Analysis (Sparse PCA) problem, and the family of Sum-of-Squares (SoS, aka Lasserre/Parillo) convex relaxations.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Nevada (0.04)
(7 more...)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Learning Theory and Algorithms for Forecasting Non-Stationary Time Series

Neural Information Processing SystemsMar-12-2024, 22:58:13 GMT

Our learning guarantees are expressed in terms of a data-dependent measure of sequential complexity and a discrepancy measure that can be estimated from data under some mild assumptions. We use our learning bounds to devise new algorithms for non-stationary time series forecasting for which we report some preliminary experimental results.

algorithm, assumption, generalization, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > United Kingdom (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.64)

Add feedback

Spectral Norm Regularization of Orthonormal Representations for Graph Transduction

Neural Information Processing SystemsMar-12-2024, 21:27:31 GMT

Recent literature [1] suggests that embedding a graph on an unit sphere leads to better generalization for graph transduction. However, the choice of optimal embedding and an efficient algorithm to compute the same remains open. In this paper, we show that orthonormal representations, a class of unit-sphere graph embeddings are PAC learnable. Existing PAC-based analysis do not apply as the VC dimension of the function class is infinite. We propose an alternative PAC-based bound, which do not depend on the VC dimension of the underlying function class, but is related to the famous Lovász ϑ function. The main contribution of the paper is SPORE, a SPectral regularized ORthonormal Embedding for graph transduction, derived from the PAC bound. SPORE is posed as a non-smooth convex function over an elliptope.

algorithm, graph, supplementary material, (13 more...)

Neural Information Processing Systems

Country:

Asia > India > Karnataka > Bengaluru (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.54)

Add feedback

Mistake Bounds for Binary Matrix Completion Mark Herbster

Neural Information Processing SystemsMar-12-2024, 17:59:09 GMT

We study the problem of completing a binary matrix in an online learning setting. On each trial we predict a matrix entry and then receive the true entry. We propose a Matrix Exponentiated Gradient algorithm [1] to solve this problem. We provide a mistake bound for the algorithm, which scales with the margin complexity [2, 3] of the underlying matrix. The bound suggests an interpretation where each row of the matrix is a prediction task over a finite set of objects, the columns. Using this we show that the algorithm makes a number of mistakes which is comparable up to a logarithmic factor to the number of mistakes made by the Kernel Perceptron with an optimal kernel in hindsight. We discuss applications of the algorithm to predicting as well as the best biclustering and to the problem of predicting the labeling of a graph without knowing the graph in advance.

algorithm, conference, matrix, (12 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom (0.04)
North America > United States > New York (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Italy > Liguria > Genoa (0.04)

Industry:

Government (0.47)
Education > Educational Setting > Online (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.36)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.34)

Add feedback

Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models Marc Vuffray

Neural Information Processing SystemsMar-12-2024, 14:00:21 GMT

We consider the problem of learning the underlying graph of an unknown Ising model on p spins from a collection of i.i.d.

coupling, estimator, ising model, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)

Industry:

Energy (0.47)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Add feedback

On the Recursive Teaching Dimension of VC Classes Xi Chen

Neural Information Processing SystemsMar-12-2024, 12:16:44 GMT

In this paper, we study the quantitative relation between RTD and the well-known learning complexity measure VC dimension (VCD), and improve the best known upper and (worst-case) lower bounds on the recursive teaching dimension with respect to the VC dimension.

dimension, rtd, vcd, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Multi-step learning and underlying structure in statistical models

Neural Information Processing SystemsMar-12-2024, 10:15:17 GMT

In multi-step learning, where a final learning task is accomplished via a sequence of intermediate learning tasks, the intuition is that successive steps or levels transform the initial data into representations more and more "suited" to the final learning task. A related principle arises in transfer-learning where Baxter (2000) proposed a theoretical framework to study how learning multiple tasks transforms the inductive bias of a learner. The most widespread multi-step learning approach is semisupervised learning with two steps: unsupervised, then supervised. Several authors (Castelli-Cover, 1996; Balcan-Blum, 2005; Niyogi, 2008; Ben-David et al, 2008; Urner et al, 2011) have analyzed SSL, with Balcan-Blum (2005) proposing a version of the PAC learning framework augmented by a "compatibility function" to link concept class and unlabeled data distribution. We propose to analyze SSL and other multi-step learning approaches, much in the spirit of Baxter's framework, by defining a learning problem generatively as a joint statistical model on X Y.

algorithm, compatibility function, learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Majority-of-Three: The Simplest Optimal Learner?

Aden-Ali, Ishaq, Høgsgaard, Mikael Møller, Larsen, Kasper Green, Zhivotovskiy, Nikita

arXiv.org Machine LearningMar-12-2024

Developing an optimal PAC learning algorithm in the realizable setting, where empirical risk minimization (ERM) is suboptimal, was a major open problem in learning theory for decades. The problem was finally resolved by Hanneke a few years ago. Unfortunately, Hanneke's algorithm is quite complex as it returns the majority vote of many ERM classifiers that are trained on carefully selected subsets of the data. It is thus a natural goal to determine the simplest algorithm that is optimal. In this work we study the arguably simplest algorithm that could be optimal: returning the majority vote of three ERM classifiers. We show that this algorithm achieves the optimal in-expectation bound on its error which is provably unattainable by a single ERM classifier. Furthermore, we prove a near-optimal high-probability bound on this algorithm's error. We conjecture that a better analysis will prove that this algorithm is in fact optimal in the high-probability regime.

algorithm, lemma 4, probability, (15 more...)

arXiv.org Machine Learning

2403.08831

Country: Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces Microsoft Research La Jolla, CA

Neural Information Processing SystemsMar-11-2024, 20:55:10 GMT

It has been a long-standing problem to efficiently learn a halfspace using as few labels as possible in the presence of noise. In this work, we propose an efficient Perceptron-based algorithm for actively learning homogeneous halfspaces under the uniform distribution over the unit sphere.

active learning, algorithm, learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Diego County > La Jolla (0.40)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada > Quebec > Montreal (0.04)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.71)

Add feedback