AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

(Exhaustive) Symbolic Regression and model selection by minimum description length

Desmond, Harry

arXiv.org Artificial IntelligenceJul-18-2025

Symbolic regression is the machine learning method for learning functions from data. After a brief overview of the symbolic regression landscape, I will describe the two main challenges that traditional algorithms face: they have an unknown (and likely significant) probability of failing to find any given good function, and they suffer from ambiguity and poorly-justified assumptions in their function-selection procedure. To address these I propose an exhaustive search and model selection by the minimum description length principle, which allows accuracy and complexity to be directly traded off by measuring each in units of information. I showcase the resulting publicly available Exhaustive Symbolic Regression algorithm on three open problems in astrophysics: the expansion history of the universe, the effective behaviour of gravity in galaxies and the potential of the inflaton field. In each case the algorithm identifies many functions superior to the literature standards. This general purpose methodology should find widespread utility in science and beyond.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.13033

Genre:

Research Report (0.50)
Overview (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Computational-Statistical Tradeoffs from NP-hardness

Blanc, Guy, Koch, Caleb, Strassle, Carmen, Tan, Li-Yang

arXiv.org Artificial IntelligenceJul-18-2025

A central question in computer science and statistics is whether efficient algorithms can achieve the information-theoretic limits of statistical problems. Many computational-statistical tradeoffs have been shown under average-case assumptions, but since statistical problems are average-case in nature, it has been a challenge to base them on standard worst-case assumptions. In PAC learning where such tradeoffs were first studied, the question is whether computational efficiency can come at the cost of using more samples than information-theoretically necessary. We base such tradeoffs on $\mathsf{NP}$-hardness and obtain: $\circ$ Sharp computational-statistical tradeoffs assuming $\mathsf{NP}$ requires exponential time: For every polynomial $p(n)$, there is an $n$-variate class $C$ with VC dimension $1$ such that the sample complexity of time-efficiently learning $C$ is $Θ(p(n))$. $\circ$ A characterization of $\mathsf{RP}$ vs. $\mathsf{NP}$ in terms of learning: $\mathsf{RP} = \mathsf{NP}$ iff every $\mathsf{NP}$-enumerable class is learnable with $O(\mathrm{VCdim}(C))$ samples in polynomial time. The forward implication has been known since (Pitt and Valiant, 1988); we prove the reverse implication. Notably, all our lower bounds hold against improper learners. These are the first $\mathsf{NP}$-hardness results for improperly learning a subclass of polynomial-size circuits, circumventing formal barriers of Applebaum, Barak, and Xiao (2008).

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.13222

Genre: Research Report (1.00)

Industry: Education (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Universal rates of ERM for agnostic learning

Hanneke, Steve, Xu, Mingyue

arXiv.org Machine LearningJul-16-2025

The universal learning framework has been developed to obtain guarantees on the learning rates that hold for any fixed distribution, which can be much faster than the ones uniformly hold over all the distributions. Given that the Empirical Risk Minimization (ERM) principle being fundamental in the PAC theory and ubiquitous in practical machine learning, the recent work of arXiv:2412.02810 studied the universal rates of ERM for binary classification under the realizable setting. However, the assumption of realizability is too restrictive to hold in practice. Indeed, the majority of the literature on universal learning has focused on the realizable case, leaving the non-realizable case barely explored. In this paper, we consider the problem of universal learning by ERM for binary classification under the agnostic setting, where the ''learning curve" reflects the decay of the excess risk as the sample size increases. We explore the possibilities of agnostic universal rates and reveal a compact trichotomy: there are three possible agnostic universal rates of ERM, being either $e^{-n}$, $o(n^{-1/2})$, or arbitrarily slow. We provide a complete characterization of which concept classes fall into each of these categories. Moreover, we also establish complete characterizations for the target-dependent universal rates as well as the Bayes-dependent universal rates.

artificial intelligence, machine learning, sequence, (16 more...)

arXiv.org Machine Learning

2506.1411

Country:

North America > United States (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Finland (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

A Randomized Algorithm for Sparse PCA based on the Basic SDP Relaxation

Del Pia, Alberto, Zhou, Dekun

arXiv.org Machine LearningJul-15-2025

Sparse Principal Component Analysis (SPCA) is a fundamental technique for dimensionality reduction, and is NP-hard. In this paper, we introduce a randomized approximation algorithm for SPCA, which is based on the basic SDP relaxation. Our algorithm has an approximation ratio of at most the sparsity constant with high probability, if called enough times. Under a technical assumption, which is consistently satisfied in our numerical tests, the average approximation ratio is also bounded by $\mathcal{O}(\log{d})$, where $d$ is the number of features. We show that this technical assumption is satisfied if the SDP solution is low-rank, or has exponentially decaying eigenvalues. We then present a broad class of instances for which this technical assumption holds. We also demonstrate that in a covariance model, which generalizes the spiked Wishart model, our proposed algorithm achieves a near-optimal approximation ratio. We demonstrate the efficacy of our algorithm through numerical results on real-world datasets.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2507.09148

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Add feedback

New Statistical and Computational Results for Learning Junta Distributions

Beretta, Lorenzo

arXiv.org Artificial IntelligenceJul-15-2025

We study the problem of learning junta distributions on $\{0, 1\}^n$, where a distribution is a $k$-junta if its probability mass function depends on a subset of at most $k$ variables. We make two main contributions: - We show that learning $k$-junta distributions is \emph{computationally} equivalent to learning $k$-parity functions with noise (LPN), a landmark problem in computational learning theory. - We design an algorithm for learning junta distributions whose statistical complexity is optimal, up to polylogarithmic factors. Computationally, our algorithm matches the complexity of previous (non-sample-optimal) algorithms. Combined, our two contributions imply that our algorithm cannot be significantly improved, statistically or computationally, barring a breakthrough for LPN.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2505.05819

Country: North America (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.69)

Add feedback

Advances in Machine Learning: Where Can Quantum Techniques Help?

Kashyap, Samarth, Ramakrishnan, Rohit K, Jyoti, Kumari, Patel, Apoorva D

arXiv.org Artificial IntelligenceJul-14-2025

Quantum Machine Learning (QML) represents a promising frontier at the intersection of quantum computing and artificial intelligence, aiming to leverage quantum computational advantages to enhance data-driven tasks. This review explores the potential of QML to address the computational bottlenecks of classical machine learning, particularly in processing complex datasets. We introduce the theoretical foundations of QML, including quantum data encoding, quantum learning theory and optimization techniques, while categorizing QML approaches based on data type and computational architecture. It is well-established that quantum computational advantages are problem-dependent, and so potentially useful directions for QML need to be systematically identified. Key developments, such as Quantum Principal Component Analysis, quantum-enhanced sensing and applications in material science, are critically evaluated for their theoretical speed-ups and practical limitations. The challenges posed by Noisy Intermediate-Scale Quantum (NISQ) devices, including hardware noise, scalability constraints and data encoding overheads, are discussed in detail. We also outline future directions, emphasizing the need for quantum-native algorithms, improved error correction, and realistic benchmarks to bridge the gap between theoretical promise and practical deployment. This comprehensive analysis underscores that while QML has significant potential for specific applications such as quantum chemistry and sensing, its broader utility in real-world scenarios remains contingent on overcoming technological and methodological hurdles.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2507.08379

Country: Asia > India (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Single-pass Adaptive Image Tokenization for Minimum Program Search

Duggal, Shivam, Byun, Sanghyun, Freeman, William T., Torralba, Antonio, Isola, Phillip

arXiv.org Artificial IntelligenceJul-11-2025

According to Algorithmic Information Theory (AIT) -- Intelligent representations compress data into the shortest possible program that can reconstruct its content, exhibiting low Kolmogorov Complexity (KC). In contrast, most visual representation learning systems use fixed-length representations for all inputs, ignoring variations in complexity or familiarity. Recent adaptive tokenization methods address this by allocating variable-length representations but typically require test-time search over multiple encodings to find the most predictive one. Inspired by Kolmogorov Complexity principles, we propose a single-pass adaptive tokenizer, KARL, which predicts the appropriate number of tokens for an image in a single forward pass, halting once its approximate KC is reached. The token count serves as a proxy for the minimum description length. KARL's training procedure closely resembles the Upside-Down Reinforcement Learning paradigm, as it learns to conditionally predict token halting based on a desired reconstruction quality. KARL matches the performance of recent adaptive tokenizers while operating in a single pass. We present scaling laws for KARL, analyzing the role of encoder/decoder size, continuous vs. discrete tokenization and more. Additionally, we offer a conceptual study drawing an analogy between Adaptive Image Tokenization and Algorithmic Information Theory, examining the predicted image complexity (KC) across axes such as structure vs. noise and in- vs. out-of-distribution familiarity -- revealing alignment with human intuition.

artificial intelligence, machine learning, representation, (17 more...)

arXiv.org Artificial Intelligence

2507.07995

Country:

North America > United States > Massachusetts (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.77)

Add feedback

A Cryptographic Perspective on Mitigation vs. Detection in Machine Learning

Gluch, Greg, Goldwasser, Shafi

arXiv.org Artificial IntelligenceJul-11-2025

In this paper, we initiate a cryptographically inspired theoretical study of detection versus mitigation of adversarial inputs produced by attackers on Machine Learning algorithms during inference time. We formally define defense by detection (DbD) and defense by mitigation (DbM). Our definitions come in the form of a 3-round protocol between two resource-bounded parties: a trainer/defender and an attacker. The attacker aims to produce inference-time inputs that fool the training algorithm. We define correctness, completeness, and soundness properties to capture successful defense at inference time while not degrading (too much) the performance of the algorithm on inputs from the training distribution. We first show that achieving DbD and achieving DbM are equivalent for ML classification tasks. Surprisingly, this is not the case for ML generative learning tasks, where there are many possible correct outputs for each input. We show a separation between DbD and DbM by exhibiting two generative learning tasks for which it is possible to defend by mitigation but it is provably impossible to defend by detection. The mitigation phase uses significantly less computational resources than the initial training algorithm. In the first learning task we consider sample complexity as the resource and in the second the time complexity. The first result holds under the assumption that the Identity-Based Fully Homomorphic Encryption (IB-FHE), publicly-verifiable zero-knowledge Succinct Non-Interactive Arguments of Knowledge (zk-SNARK), and Strongly Unforgeable Signatures exist. The second result assumes the existence of Non-Parallelizing Languages with Average-Case Hardness (NPL) and Incrementally-Verifiable Computation (IVC) and IB-FHE.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2504.2031

Country:

North America > United States (0.28)
North America > Canada (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

High-Dimensional Learning in Finance

Fallahgoul, Hasan

arXiv.org Machine LearningJul-8-2025

Recent advances in machine learning have shown promising results for financial prediction using large, over-parameterized models. This paper provides theoretical foundations and empirical validation for understanding when and how these methods achieve predictive success. I examine two key aspects of high-dimensional learning in finance. First, I prove that within-sample standardization in Random Fourier Features implementations fundamentally alters the underlying Gaussian kernel approximation, replacing shift-invariant kernels with training-set dependent alternatives. Second, I establish information-theoretic lower bounds that identify when reliable learning is impossible no matter how sophisticated the estimator. A detailed quantitative calibration of the polynomial lower bound shows that with typical parameter choices, e.g., 12,000 features, 12 monthly observations, and R-square 2-3%, the required sample size to escape the bound exceeds 25-30 years of data--well beyond any rolling-window actually used. Thus, observed out-of-sample success must originate from lower-complexity artefacts rather than from the intended high-dimensional mechanism.

artificial intelligence, complexity, machine learning, (18 more...)

arXiv.org Machine Learning

2506.0378

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > Victoria (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Add feedback

Consistency-Aware Padding for Incomplete Multi-Modal Alignment Clustering Based on Self-Repellent Greedy Anchor Search

Ma, Shubin, Zhao, Liang, Lu, Mingdong, Guo, Yifan, Xu, Bo

arXiv.org Artificial IntelligenceJul-8-2025

Multimodal representation is faithful and highly effective in describing real-world data samples' characteristics by describing their complementary information. However, the collected data often exhibits incomplete and misaligned characteristics due to factors such as inconsistent sensor frequencies and device malfunctions. Existing research has not effectively addressed the issue of filling missing data in scenarios where multiview data are both imbalanced and misaligned. Instead, it relies on class-level alignment of the available data. Thus, it results in some data samples not being well-matched, thereby affecting the quality of data fusion. In this paper, we propose the Consistency-A ware Padding for Incomplete Multimodal Alignment Clustering Based on Self-Repellent Greedy Anchor Search(CAPIMAC) to tackle the problem of filling imbalanced and mis-aligned data in multimodal datasets. Specifically, we propose a self-repellent greedy anchor search module(SRGASM), which employs a self-repellent random walk combined with a greedy algorithm to identify anchor points for re-representing incomplete and misaligned multimodal data. Subsequently, based on noise-contrastive learning, we design a consistency-aware padding module (CAPM) to effectively interpolate and align imbalanced and misaligned data, thereby improving the quality of multimodal data fusion. Experimental results demonstrate the superiority of our method over benchmark datasets.

artificial intelligence, data quality, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.03917

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.69)
(2 more...)

Add feedback