AITopics | Computational Learning Theory

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of artificial intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia).

Reviews: Globally Optimal Learning for Structured Elliptical Losses

Neural Information Processing Systems | Jan-22-2025, 02:25:24 GMT

All reviewers agreed that this paper makes an interesting contribution to NeurIPS. Please make sure to take the reviewers' comments into consideration for the camera-ready version, in particular the contextualization of the work.

globally optimal learning, reviewer, structured elliptical loss

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Review for NeurIPS paper: Towards Problem-dependent Optimal Learning Rates

Neural Information Processing Systems | Jan-22-2025, 02:00:58 GMT

Clarity: The paper is easy to read, despite being a theoretical work. The authors introduce all of the key concepts and make the manuscript (relatively) self-contained (given the format they do a good job making the paper accessible). However, there are a lot of grammar mistakes/typos, so the whole manuscript has to be very carefully checked.

manuscript, neurips paper, problem-dependent optimal learning rate

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Review for NeurIPS paper: Towards Problem-dependent Optimal Learning Rates

Neural Information Processing Systems | Jan-22-2025, 02:00:51 GMT

The reviewers agree that this is an exciting and interesting paper which improves the best-known variance-dependent rates for statistical learning with nonparametric classes, and are all in favor of accepting. I hope the authors will pay attention to the typos and clarifications pointed out by the reviewers and address these in the final version of the paper. As reviewer 4 and the authors' response mention, the point about removing the \log(n) factor for VC classes is subtle, and this paper does not really remove this term unless we make specific assumptions on the value of V*. I would recommend the authors either expand the discussion about this and include a more detailed comparison with prior work, or minimize this claim.

neurips paper, problem-dependent optimal learning rate

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

The regret lower bound for communicating Markov Decision Processes

Victor Boone, Odalric-Ambrym Maillard

arXiv.org Machine Learning | Jan-22-2025

This paper is devoted to the extension of the regret lower bound beyond ergodic Markov decision processes (MDPs) in the problem-dependent setting. While the regret lower bound for ergodic MDPs is well-known and reached by tractable algorithms, we prove that the regret lower bound becomes significantly more complex in communicating MDPs. Our lower bound revisits the necessary explorative behavior of consistent learning agents and further explains that all optimal regions of the environment must be overvisited compared to sub-optimal ones, a phenomenon that we refer to as co-exploration. In tandem, we show that these two explorative and co-explorative behaviors are intertwined with navigation constraints obtained by scrutinizing the navigation structure at logarithmic scale. The resulting lower bound is expressed as the solution of an optimization problem that, in many standard classes of MDPs, can be specialized to recover existing results. From a computational perspective, it is provably $\Sigma_2^\textrm{P}$-hard in general and, as a matter of fact, even testing membership in the feasible region is coNP-hard. We further provide an algorithm to approximate the lower bound in a constructive way.

artificial intelligence, machine learning, markov decision process, (17 more...)

2501.13013

Country: Europe > France (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)

Reviews: Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models

Neural Information Processing Systems | Jan-20-2025, 15:05:06 GMT

The main effort is to improve upon the recent results provided by Bresler, showing that the complexity of identifying the structure of a maximum-degree-d Ising model is polynomial in p and independent of d. Strong points: 1) The topic of this paper is timely: there is ongoing interest and work on Ising model reconstruction. Weak points: 1) The whole approach is based on the introduction of the ISO. This is the main trick in the proposed approach. Other than that, the rest of the method and its analysis are usual and well studied (l_1-penalization and connection with the tutorial by Negahban et.

efficient and sample-optimal learning, interaction screening, ising model, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

PAC Learning Linear Thresholds from Label Proportions

Neural Information Processing Systems | Jan-19-2025, 23:12:48 GMT

Learning from label proportions (LLP) is a generalization of supervised learning in which the training data is available as sets or bags of feature-vectors (instances) along with the average instance-label of each bag. The goal is to train a good instance classifier. While most previous works on LLP have focused on training models on such training data, computational learnability of LLP was only recently explored by Saket (2021, 2022), who showed worst-case intractability of properly learning linear threshold functions (LTFs) from label proportions. However, their work did not rule out efficient algorithms for this problem for natural distributions. In this work we show that it is indeed possible to efficiently learn LTFs using LTFs when given access to random bags of some label proportion in which feature-vectors are, conditioned on their labels, independently sampled from a Gaussian distribution N(µ, Σ). Our work shows that a certain matrix – formed using covariances of the differences of feature-vectors sampled from the bags with and without replacement – necessarily has its principal component, after a transformation, in the direction of the normal vector of the LTF.
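
As a rough illustration of the data format described in this abstract, the sketch below builds one bag whose instances are drawn from a Gaussian conditioned on the label assigned by a linear threshold function, and returns only the bag's label proportion. It is not the paper's learning algorithm. The names `ltf_label`, `sample_with_label`, and `make_bag`, the isotropic covariance (a single `sigma` instead of a full Σ), the rejection-sampling approach, and all numeric values are assumptions made for illustration.

```python
import random


def ltf_label(x, w, c):
    """Label from a linear threshold function: 1 if w.x + c > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + c > 0 else 0


def sample_with_label(y, w, c, mu, sigma, rng):
    """Rejection-sample x ~ N(mu, sigma^2 I) conditioned on ltf_label(x) == y.

    Assumes both labels have positive probability, otherwise this loops forever.
    """
    while True:
        x = [rng.gauss(m, sigma) for m in mu]
        if ltf_label(x, w, c) == y:
            return x


def make_bag(bag_size, n_pos, w, c, mu, sigma, rng):
    """Return (feature vectors, label proportion); instance labels stay hidden."""
    labels = [1] * n_pos + [0] * (bag_size - n_pos)
    rng.shuffle(labels)
    features = [sample_with_label(y, w, c, mu, sigma, rng) for y in labels]
    return features, n_pos / bag_size


# Hypothetical LTF and Gaussian parameters, chosen only for the example.
w, c = [1.0, -1.0], 0.0
features, proportion = make_bag(8, 2, w, c, [0.0, 0.0], 1.0, random.Random(0))
```

A learner in this setting sees only `features` and `proportion` for each bag, never the per-instance labels.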

label proportion, ltf, pac learning linear threshold, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Optimal Learners for Realizable Regression: PAC Learning and Online Learning

Neural Information Processing Systems | Jan-19-2025, 14:07:07 GMT

In this work, we aim to characterize the statistical complexity of realizable regression both in the PAC learning setting and the online learning setting. Previous work had established the sufficiency of finiteness of the fat shattering dimension for PAC learnability and the necessity of finiteness of the scaled Natarajan dimension, but little progress had been made towards a more complete characterization since the work of Simon 1997 (SICOMP '97). To this end, we first introduce a minimax instance optimal learner for realizable regression and propose a novel dimension that both qualitatively and quantitatively characterizes which classes of real-valued predictors are learnable. We then identify a combinatorial dimension related to the graph dimension that characterizes ERM learnability in the realizable setting. Finally, we establish a necessary condition for learnability based on a combinatorial dimension related to the DS dimension, and conjecture that it may also be sufficient in this context.

dimension, pac learning and online learning, realizable regression, (4 more...)

Industry: Education > Educational Setting > Online (0.69)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.65)

Training Neural Networks is NP-Hard in Fixed Dimension

Neural Information Processing Systems | Jan-19-2025, 13:38:39 GMT

We study the parameterized complexity of training two-layer neural networks with respect to the dimension of the input data and the number of hidden neurons, considering ReLU and linear threshold activation functions. Although the computational complexity of these problems has been studied numerous times in recent years, several questions are still open. We answer questions by Arora et al. (ICLR 2018) and Khalife and Basu (IPCO 2022), showing that both problems are NP-hard for two dimensions, which excludes any polynomial-time algorithm for constant dimension. We also answer a question by Froese et al. (JAIR 2022), proving W[1]-hardness for four ReLUs (or two linear threshold neurons) with zero training error. Finally, in the ReLU case, we show fixed-parameter tractability for the combined parameter number of dimensions and number of ReLUs if the network is assumed to compute a convex map.
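
To make the object of study concrete, here is a minimal sketch of a two-layer ReLU network and its training error on a fixed data set. Evaluating the error for given weights is cheap; the hardness results above concern finding weights that minimize it. The squared loss, the absence of an output bias, and the names `network_output` and `training_error` are assumptions for illustration and are not taken from the paper.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def network_output(x, hidden_weights, hidden_biases, output_weights):
    """Two-layer network: sum_j a_j * relu(w_j . x + b_j)."""
    return sum(
        a * relu(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for w, b, a in zip(hidden_weights, hidden_biases, output_weights)
    )


def training_error(data, hidden_weights, hidden_biases, output_weights):
    """Sum of squared errors over (x, y) pairs (squared loss is an assumption)."""
    return sum(
        (network_output(x, hidden_weights, hidden_biases, output_weights) - y) ** 2
        for x, y in data
    )


# Two ReLUs represent |x| exactly as relu(x) + relu(-x), so the error is zero.
data = [([-1.0], 1.0), ([0.0], 0.0), ([2.0], 2.0)]
error = training_error(data, [[1.0], [-1.0]], [0.0, 0.0], [1.0, 1.0])
```

In this toy case the zero-error weights are known in advance; the decision problem asks whether such weights exist for arbitrary data.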

fixed dimension, np-hard, training neural network, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.66)

Is Out-of-Distribution Detection Learnable?

Neural Information Processing Systems | Jan-19-2025, 06:51:59 GMT

Supervised learning aims to train a classifier under the assumption that training and test data are from the same distribution. To relax this assumption, researchers have studied a more realistic setting: out-of-distribution (OOD) detection, where test data may come from classes that are unknown during training (i.e., OOD data). Due to the unavailability and diversity of OOD data, good generalization ability is crucial for effective OOD detection algorithms. To study the generalization of OOD detection, in this paper, we investigate the probably approximately correct (PAC) learning theory of OOD detection, which has been posed by researchers as an open problem. First, we find a necessary condition for the learnability of OOD detection.

detection, ood detection, out-of-distribution detection learnable, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.62)

Towards Practical Few-shot Query Sets: Transductive Minimum Description Length Inference

Neural Information Processing Systems | Jan-19-2025, 02:50:51 GMT

Standard few-shot benchmarks are often built upon simplifying assumptions on the query sets, which may not always hold in practice. In particular, for each task at testing time, the classes effectively present in the unlabeled query set are known a priori, and correspond exactly to the set of classes represented in the labeled support set. We relax these assumptions and extend current benchmarks, so that the query-set classes of a given task are unknown, but just belong to a much larger set of possible classes. Our setting could be viewed as an instance of the challenging yet practical problem of extremely imbalanced K-way classification, K being much larger than the values typically used in standard benchmarks, and with potentially irrelevant supervision from the support set. Motivated by these observations, we introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task, under supervision constraints from the support set.

practical few-shot query set, transductive minimum description length inference, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.40)
