AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Testing classical properties from quantum data

Caro, Matthias C., Naik, Preksha, Slote, Joseph

arXiv.org Artificial IntelligenceNov-19-2024

Many properties of Boolean functions can be tested far more efficiently than the function can be learned. However, this advantage often disappears when testers are limited to random samples--a natural setting for data science--rather than queries. In this work we investigate the quantum version of this scenario: quantum algorithms that test properties of a function $f$ solely from quantum data in the form of copies of the function state for $f$. For three well-established properties, we show that the speedup lost when restricting classical testers to samples can be recovered by testers that use quantum data. For monotonicity testing, we give a quantum algorithm that uses $\tilde{\mathcal{O}}(n^2)$ function state copies as compared to the $2^{\Omega(\sqrt{n})}$ samples required classically. We also present $\mathcal{O}(1)$-copy testers for symmetry and triangle-freeness, comparing favorably to classical lower bounds of $\Omega(n^{1/4})$ and $\Omega(n)$ samples respectively. These algorithms are time-efficient and necessarily include techniques beyond the Fourier sampling approaches applied to earlier testing problems. These results make the case for a general study of the advantages afforded by quantum data for testing. We contribute to this project by complementing our upper bounds with a lower bound of $\Omega(1/\varepsilon)$ for monotonicity testing from quantum data in the proximity regime $\varepsilon\leq\mathcal{O}(n^{-3/2})$. This implies a strict separation between testing monotonicity from quantum data and from quantum queries--where $\tilde{\mathcal{O}}(n)$ queries suffice when $\varepsilon=\Theta(n^{-3/2})$. We also exhibit a testing problem that can be solved from $\mathcal{O}(1)$ classical queries but requires $\Omega(2^{n/2})$ function state copies, complementing a separation of the same magnitude in the opposite direction derived from the Forrelation problem.

artificial intelligence, machine learning, property testing, (18 more...)

arXiv.org Artificial Intelligence

2411.1273

Country:

Europe (0.92)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Los Angeles County (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Add feedback

On Reductions and Representations of Learning Problems in Euclidean Spaces

Chornomaz, Bogdan, Moran, Shay, Waknine, Tom

arXiv.org Machine LearningNov-16-2024

Many practical prediction algorithms represent inputs in Euclidean space and replace the discrete 0/1 classification loss with a real-valued surrogate loss, effectively reducing classification tasks to stochastic optimization. In this paper, we investigate the expressivity of such reductions in terms of key resources, including dimension and the role of randomness. We establish bounds on the minimum Euclidean dimension $D$ needed to reduce a concept class with VC dimension $d$ to a Stochastic Convex Optimization (SCO) problem in $\mathbb{R}^D$, formally addressing the intuitive interpretation of the VC dimension as the number of parameters needed to learn the class. To achieve this, we develop a generalization of the Borsuk-Ulam Theorem that combines the classical topological approach with convexity considerations. Perhaps surprisingly, we show that, in some cases, the number of parameters $D$ must be exponentially larger than the VC dimension $d$, even if the reduction is only slightly non-trivial. We also present natural classification tasks that can be represented in much smaller dimensions by leveraging randomness, as seen in techniques like random initialization. This result resolves an open question posed by Kamath, Montasser, and Srebro (COLT 2020). Our findings introduce new variants of \emph{dimension complexity} (also known as \emph{sign-rank}), a well-studied parameter in learning and complexity theory. Specifically, we define an approximate version of sign-rank and another variant that captures the minimum dimension required for a reduction to SCO. We also propose several open questions and directions for future research.

artificial intelligence, machine learning, reduction, (16 more...)

arXiv.org Machine Learning

2411.10784

Country:

Europe (0.46)
North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Education > Focused Education > Special Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

The Limits of Differential Privacy in Online Learning

Li, Bo, Wang, Wei, Ye, Peng

arXiv.org Artificial IntelligenceNov-8-2024

Differential privacy (DP) is a formal notion that restricts the privacy leakage of an algorithm when running on sensitive data, in which privacy-utility trade-off is one of the central problems in private data analysis. In this work, we investigate the fundamental limits of differential privacy in online learning algorithms and present evidence that separates three types of constraints: no DP, pure DP, and approximate DP. We first describe a hypothesis class that is online learnable under approximate DP but not online learnable under pure DP under the adaptive adversarial setting. This indicates that approximate DP must be adopted when dealing with adaptive adversaries. We then prove that any private online learner must make an infinite number of mistakes for almost all hypothesis classes. This essentially generalizes previous results and shows a strong separation between private and non-private settings since a finite mistake bound is always attainable (as long as the class is online learnable) when there is no privacy requirement.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2411.05483

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.64)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Add feedback

Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length

Yu, Zihan, Ding, Jingtao, Li, Yong

arXiv.org Artificial IntelligenceNov-6-2024

Symbolic regression, a task discovering the formula best fitting the given data, is typically based on the heuristical search. These methods usually update candidate formulas to obtain new ones with lower prediction errors iteratively. However, since formulas with similar function shapes may have completely different symbolic forms, the prediction error does not decrease monotonously as the search approaches the target formula, causing the low recovery rate of existing methods. To solve this problem, we propose a novel search objective based on the minimum description length, which reflects the distance from the target and decreases monotonically as the search approaches the correct form of the target formula. To estimate the minimum description length of any input data, we design a neural network, MDLformer, which enables robust and scalable estimation through large-scale training. With the MDLformer's output as the search objective, we implement a symbolic regression method, SR4MDL, that can effectively recover the correct mathematical form of the formula. Extensive experiments illustrate its excellent performance in recovering formulas from data. Our method successfully recovers around 50 formulas across two benchmark datasets comprising 133 problems, outperforming state-of-the-art methods by 43.92%.

evolutionary algorithm, formula, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2411.03753

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.94)
(2 more...)

Add feedback

Sample-Efficient Agnostic Boosting

Ghai, Udaya, Singh, Karan

arXiv.org Machine LearningOct-31-2024

The theory of boosting provides a computational framework for aggregating approximate weak learning algorithms, which perform marginally better than a random predictor, into an accurate strong learner. In the realizable case, the success of the boosting approach is underscored by a remarkable fact that the resultant sample complexity matches that of a computationally demanding alternative, namely Empirical Risk Minimization (ERM). This in particular implies that the realizable boosting methodology has the potential to offer computational relief without compromising on sample efficiency. Despite recent progress, in agnostic boosting, where assumptions on the conditional distribution of labels given feature descriptions are absent, ERM outstrips the agnostic boosting methodology in being quadratically more sample efficient than all known agnostic boosting algorithms. In this paper, we make progress on closing this gap, and give a substantially more sample efficient agnostic boosting algorithm than those known, without compromising on the computational (or oracle) complexity. A key feature of our algorithm is that it leverages the ability to reuse samples across multiple rounds of boosting, while guaranteeing a generalization error strictly better than those obtained by blackbox applications of uniform convergence arguments. We also apply our approach to other previously studied learning problems, including boosting for reinforcement learning, and demonstrate improved results.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2410.23632

Country:

North America > United States (0.14)
Europe > Netherlands (0.14)

Genre: Research Report (0.50)

Industry:

Education (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Prospective Learning: Learning for a Dynamic Future

De Silva, Ashwin, Ramesh, Rahul, Yang, Rubing, Yu, Siyu, Vogelstein, Joshua T, Chaudhari, Pratik

arXiv.org Machine LearningOct-31-2024

In real-world applications, the distribution of the data, and our goals, evolve over time. The prevailing theoretical framework for studying machine learning, namely probably approximately correct (PAC) learning, largely ignores time. As a consequence, existing strategies to address the dynamic nature of data and goals exhibit poor real-world performance. This paper develops a theoretical framework called "Prospective Learning" that is tailored for situations when the optimal hypothesis changes over time. In PAC learning, empirical risk minimization (ERM) is known to be consistent. We develop a learner called Prospective ERM, which returns a sequence of predictors that make predictions on future data. We prove that the risk of prospective ERM converges to the Bayes risk under certain assumptions on the stochastic process generating the data. Prospective ERM, roughly speaking, incorporates time as an input in addition to the data. We show that standard ERM as done in PAC learning, without incorporating time, can result in failure to learn when distributions are dynamic. Numerical experiments illustrate that prospective ERM can learn synthetic and visual recognition problems constructed from MNIST and CIFAR-10.

large language model, learner, machine learning, (18 more...)

arXiv.org Machine Learning

2411.00109

Country:

North America > United States (0.67)
Europe > United Kingdom > England (0.45)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization

Montasser, Omar, Shao, Han, Abbe, Emmanuel

arXiv.org Artificial IntelligenceOct-30-2024

Learning with identical train and test distributions has been extensively investigated both practically and theoretically. Much remains to be understood, however, in statistical learning under distribution shifts. This paper focuses on a distribution shift setting where train and test distributions can be related by classes of (data) transformation maps. We initiate a theoretical study for this framework, investigating learning scenarios where the target class of transformations is either known or unknown. We establish learning rules and algorithmic reductions to Empirical Risk Minimization (ERM), accompanied with learning guarantees. We obtain upper bounds on the sample complexity in terms of the VC dimension of the class composing predictors with transformations, which we show in many cases is not much larger than the VC dimension of the class of predictors. We highlight that the learning rules we derive offer a game-theoretic viewpoint on distribution shift: a learner searching for predictors and an adversary searching for transformation maps to respectively minimize and maximize the worst-case loss.

artificial intelligence, machine learning, transformation, (14 more...)

arXiv.org Artificial Intelligence

2410.23461

Country:

Europe (1.00)
North America > United States > Hawaii (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.88)

Add feedback

Enhancing PAC Learning of Half spaces Through Robust Optimization Techniques

Tavangari, Shirmohammad, Shakarami, Zahra, Yelghi, Aref, Yelghi, Asef

arXiv.org Artificial IntelligenceOct-28-2024

This paper explores the challenges of PAC learning in semi-enclosed environments that face persistent disruptive noise and demonstrates the weaknesses of traditional learning models based on noise-free data. We present a novel algorithm that enhances noise robustness in semiconservative learning by using robust optimization techniques and advanced error correction methods and improves learning accuracy without adding additional computational cost. We also prove that this algorithm is very resistant to hostile noises. Experimental results on various datasets demonstrate its effectiveness. They provide a scalable solution for increasing the reliability of machine learning in noisy environments which contributes to noise-resilient learning and increased confidence in ML applications.

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.16573

Country:

Europe (0.69)
Asia > Middle East > Republic of Türkiye (0.30)
North America > Canada (0.28)

Genre: Research Report (0.84)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.92)

Add feedback

Testing Support Size More Efficiently Than Learning Histograms

Pinto, Renato Ferreira Jr., Harms, Nathaniel

arXiv.org Artificial IntelligenceOct-24-2024

Consider two problems about an unknown probability distribution $p$: 1. How many samples from $p$ are required to test if $p$ is supported on $n$ elements or not? Specifically, given samples from $p$, determine whether it is supported on at most $n$ elements, or it is "$\epsilon$-far" (in total variation distance) from being supported on $n$ elements. 2. Given $m$ samples from $p$, what is the largest lower bound on its support size that we can produce? The best known upper bound for problem (1) uses a general algorithm for learning the histogram of the distribution $p$, which requires $\Theta(\tfrac{n}{\epsilon^2 \log n})$ samples. We show that testing can be done more efficiently than learning the histogram, using only $O(\tfrac{n}{\epsilon \log n} \log(1/\epsilon))$ samples, nearly matching the best known lower bound of $\Omega(\tfrac{n}{\epsilon \log n})$. This algorithm also provides a better solution to problem (2), producing larger lower bounds on support size than what follows from previous work. The proof relies on an analysis of Chebyshev polynomial approximations outside the range where they are designed to be good approximations, and the paper is intended as an accessible self-contained exposition of the Chebyshev polynomial method.

artificial intelligence, machine learning, support size, (16 more...)

arXiv.org Artificial Intelligence

2410.18915

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Add feedback

Screw Geometry Meets Bandits: Incremental Acquisition of Demonstrations to Generate Manipulation Plans

Das, Dibyendu, Patankar, Aditya, Chakraborty, Nilanjan, Ramakrishnan, C. R., Ramakrishnan, I. V.

arXiv.org Artificial IntelligenceOct-23-2024

In this paper, we study the problem of methodically obtaining a sufficient set of kinesthetic demonstrations, one at a time, such that a robot can be confident of its ability to perform a complex manipulation task in a given region of its workspace. Although Learning from Demonstrations has been an active area of research, the problems of checking whether a set of demonstrations is sufficient, and systematically seeking additional demonstrations have remained open. We present a novel approach to address these open problems using (i) a screw geometric representation to generate manipulation plans from demonstrations, which makes the sufficiency of a set of demonstrations measurable; (ii) a sampling strategy based on PAC-learning from multi-armed bandit optimization to evaluate the robot's ability to generate manipulation plans in a subregion of its task space; and (iii) a heuristic to seek additional demonstration from areas of weakness. Thus, we present an approach for the robot to incrementally and actively ask for new demonstration examples until the robot can assess with high confidence that it can perform the task successfully. We present experimental results on two example manipulation tasks, namely, pouring and scooping, to illustrate our approach. A short video on the method: https://youtu.be/R-qICICdEos

data mining, demonstration, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.18275

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

Add feedback