AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Multiclass versus Binary Differentially Private PAC Learning

Neural Information Processing SystemsJan-19-2025, 00:33:51 GMT

We show a generic reduction from multiclass differentially private PAC learning to binary private PAC learning. We apply this transformation to a recently proposed binary private PAC learner to obtain a private multiclass learner with sample complexity that has a polynomial dependence on the multiclass Littlestone dimension and a poly-logarithmic dependence on the number of classes. This yields a doubly exponential improvement in the dependence on both parameters over learners from previous work. Our proof extends the notion of \Psi -dimension defined in work of Ben-David et al. [JCSS, 1995] to the online setting and explores its general properties.

binary differentially private pac learning, dependence, learner

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.97)

Add feedback

On Optimal Learning Under Targeted Data Poisoning

Neural Information Processing SystemsJan-18-2025, 20:22:27 GMT

Consider the task of learning a hypothesis class \mathcal{H} in the presence of an adversary that can replace up to an \eta fraction of the examples in the training set with arbitrary adversarial examples. The adversary aims to fail the learner on a particular target test point x which is \emph{known} to the adversary but not to the learner. In this work we aim to characterize the smallest achievable error \epsilon \epsilon(\eta) by the learner in the presence of such an adversary in both realizable and agnostic settings. Remarkably, we show that the upper bound can be attained by a deterministic learner. In the agnostic setting we reveal a more elaborate landscape: we devise a deterministic learner with a multiplicative regret guarantee of \epsilon \leq C\cdot\mathtt{OPT} O(\mathtt{VC}(\mathcal{H})\cdot \eta), where C 1 is a universal numerical constant.

learner, mathcal, mathtt, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.53)

Add feedback

Private learning implies quantum stability

Neural Information Processing SystemsJan-18-2025, 14:42:01 GMT

Learning an unknown n-qubit quantum state rho is a fundamental challenge in quantum computing. Information-theoretically, it is known that tomography requires exponential in n many copies of rho to estimate its entries. Motivated by learning theory, Aaronson et al. introduced many (weaker) learning models: the PAC model of learning states (Proceedings of Royal Society A'07), shadow tomography (STOC'18) for learning shadows" of a state, a model that also requires learners to be differentially private (STOC'19) and the online model of learning states (NeurIPS'18). In these models it was shown that an unknown state can be learnedapproximately" using linear in n many copies of rho. But is there any relationship between these models? In this paper we prove a sequence of (information-theoretic) implications from differentially-private PAC learning to online learning and then to quantum stability.Our main result generalizes the recent work of Bun, Livni and Moran (Journal of the ACM'21) who showed that finite Littlestone dimension (of Boolean-valued concept classes) implies PAC learnability in the (approximate) differentially private (DP) setting.

learning, quantum stability, quantum state, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.60)

Add feedback

On the Stability-Plasticity Dilemma in Continual Meta-Learning: Theory and Algorithm

Neural Information Processing SystemsJan-18-2025, 11:02:17 GMT

We focus on Continual Meta-Learning (CML), which targets accumulating and exploiting meta-knowledge on a sequence of non-i.i.d. The primary challenge is to strike a balance between stability and plasticity, where a model should be stable to avoid catastrophic forgetting in previous tasks and plastic to learn generalizable concepts from new tasks. To address this, we formulate the CML objective as controlling the average excess risk upper bound of the task sequence, which reflects the trade-off between forgetting and generalization. Based on the objective, we introduce a unified theoretical framework for CML in both static and shifting environments, providing guarantees for various task-specific learning algorithms. Moreover, we first present a rigorous analysis of a bi-level trade-off in shifting environments.

continual meta-learning, stability-plasticity dilemma, theory and algorithm, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Add feedback

IMED-RL: Regret optimal learning of ergodic Markov decision processes

Neural Information Processing SystemsJan-18-2025, 11:02:02 GMT

We consider reinforcement learning in a discrete, undiscounted, infinite-horizon Markov decision problem (MDP) under the average reward criterion, and focus on the minimization of the regret with respect to an optimal policy, when the learner does not know the rewards nor transitions of the MDP. In light of their success at regret minimization in multi-armed bandits, popular bandit strategies, such as the optimistic \texttt{UCB}, \texttt{KL-UCB} or the Bayesian Thompson sampling strategy, have been extended to the MDP setup. Despite some key successes, existing strategies for solving this problem either fail to be provably asymptotically optimal, or suffer from prohibitive burn-in phase and computational complexity when implemented in practice. In this work, we shed a novel light on regret minimization strategies, by extending to reinforcement learning the computationally appealing Indexed Minimum Empirical Divergence (\texttt{IMED}) bandit algorithm. Traditional asymptotic problem-dependent lower bounds on the regret are known under the assumption that the MDP is \emph{ergodic}.

ergodic markov decision process, imed-rl, texttt, (9 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Add feedback

Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization

Neural Information Processing SystemsJan-17-2025, 23:01:48 GMT

We consider linear prediction with a convex Lipschitz loss, or more generally, stochastic convex optimization problems of generalized linear form, i.e. We show that in this setting, early stopped Gradient Descent (GD), without any explicit regularization or projection, ensures excess error at most \varepsilon (compared to the best possible with unit Euclidean norm) with an optimal, up to logarithmic factors, sample complexity of \tilde{O}(1/\varepsilon 2) and only \tilde{O}(1/\varepsilon 2) iterations. This contrasts with general stochastic convex optimization, where \Omega(1/\varepsilon 4) iterations are needed Amir et al. 2021. The lower iteration complexity is ensured by leveraging uniform convergence rather than stability. But instead of uniform convergence in a norm ball, which we show can guarantee suboptimal learning using \Theta(1/\varepsilon 4) samples, we rely on uniform convergence in a distribution-dependent ball.

generalized linear stochastic convex optimization, gradient descent, varepsilon, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Add feedback

Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons

Neural Information Processing SystemsJan-17-2025, 15:44:13 GMT

The response time of physical computational elements is finite, and neurons are no exception. In hierarchical models of cortical networks each layer thus introduces a response lag. This inherent property of physical dynamical systems results in delayed processing of stimuli and causes a timing mismatch between network output and instructive signals, thus afflicting not only inference, but also learning. We introduce Latent Equilibrium, a new framework for inference and learning in networks of slow components which avoids these issues by harnessing the ability of biological neurons to phase-advance their output with respect to their membrane potential. This principle enables quasi-instantaneous inference independent of network depth and avoids the need for phased plasticity or computationally expensive network relaxation phases.

latent equilibrium, neuron, unified learning theory, (3 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.08)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.40)

Add feedback

A Combinatorial Perspective on the Optimization of Shallow ReLU Networks

Neural Information Processing SystemsJan-17-2025, 15:42:43 GMT

The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over each training example's activation pattern followed by a constrained convex problem given a fixed set of activation patterns. We explore the implications of this combinatorial aspect of ReLU optimization in this work. We show that it can be naturally modeled via a geometric and combinatoric object known as a zonotope with its vertex set isomorphic to the set of feasible activation patterns. This assists in analysis and provides a foundation for further research. We demonstrate its usefulness when we explore the sensitivity of the optimal loss to perturbations of the training data.

activation pattern, combinatorial perspective, shallow relu network, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.43)

Add feedback

SpaceTime: Causal Discovery from Non-Stationary Time Series

Mameche, Sarah, Cornanguer, Lénaïg, Ninad, Urmi, Vreeken, Jilles

arXiv.org Artificial IntelligenceJan-17-2025

Understanding causality is challenging and often complicated by changing causal relationships over time and across environments. Climate patterns, for example, shift over time with recurring seasonal trends, while also depending on geographical characteristics such as ecosystem variability. Existing methods for discovering causal graphs from time series either assume stationarity, do not permit both temporal and spatial distribution changes, or are unaware of locations with the same causal relationships. In this work, we therefore unify the three tasks of causal graph discovery in the non-stationary multi-context setting, of reconstructing temporal regimes, and of partitioning datasets and time intervals into those where invariant causal relationships hold. To construct a consistent score that forms the basis of our method, we employ the Minimum Description Length principle. Our resulting algorithm SPACETIME simultaneously accounts for heterogeneity across space and non-stationarity over time. Given multiple time series, it discovers regime changepoints and a temporal causal graph using non-parametric functional modeling and kernelized discrepancy testing. We also show that our method provides insights into real-world phenomena such as river-runoff measured at different catchments and biosphere-atmosphere interactions across ecosystems.

artificial intelligence, machine learning, mechanism, (14 more...)

arXiv.org Artificial Intelligence

2501.10235

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Evaluated CMI Bounds for Meta Learning: Tightness and Expressiveness

Neural Information Processing SystemsJan-16-2025, 07:40:49 GMT

Recent work has established that the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020) is expressive enough to capture generalization guarantees in terms of algorithmic stability, VC dimension, and related complexity measures for conventional learning (Harutyunyan et al., 2021, Haghifam et al., 2021). Hence, it provides a unified method for establishing generalization bounds. In meta learning, there has so far been a divide between information-theoretic results and results from classical learning theory. In this work, we take a first step toward bridging this divide. Specifically, we present novel generalization bounds for meta learning in terms of the evaluated CMI (e-CMI).

evaluated cmi bound, meta learning, tightness and expressiveness, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.62)

Add feedback