Computational Learning Theory


Local Regularizers Are Not Transductive Learners

arXiv.org Machine Learning

We partly resolve an open question raised by Asilis et al. (COLT 2024): whether the algorithmic template of local regularization -- an intriguing generalization of explicit regularization, a.k.a. structural risk minimization -- suffices to learn all learnable multiclass problems. Specifically, we provide a negative answer to this question in the transductive model of learning. We exhibit a multiclass classification problem which is learnable in both the transductive and PAC models, yet cannot be learned transductively by any local regularizer. The corresponding hypothesis class, and our proof, are based on principles from cryptographic secret sharing. We outline challenges in extending our negative result to the PAC model, leaving open the tantalizing possibility of a PAC/transductive separation with respect to local regularization.
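The hypothesis class above is built on cryptographic secret sharing. As a minimal illustration of that principle only (not the paper's construction), a 2-out-of-2 XOR scheme splits a secret into two shares, each of which is uniformly random on its own and reveals nothing without the other:

```python
import secrets

def share(secret: int, n_bits: int = 32) -> tuple[int, int]:
    """Split an integer secret into two XOR shares. Either share alone
    is uniformly random, so it carries no information about the secret."""
    s1 = secrets.randbits(n_bits)
    s2 = secret ^ s1
    return s1, s2

def reconstruct(s1: int, s2: int) -> int:
    """Recover the secret; this requires both shares."""
    return s1 ^ s2
```

The learning-theoretic construction exploits the same all-or-nothing flavor: partial information about the labeled sample is useless to a local regularizer.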


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

This paper studies the problem of learning shuffle ideals in the PAC-learning model under simple distributions, such as product distributions or distributions generated by Markov chains. A shuffle ideal is a particularly simple type of regular language. Given an alphabet \Sigma and a string u = (u_1, \dots, u_L), the prime shuffle ideal generated by u is the set of strings obtained by inserting arbitrary strings s_0, \dots, s_L into u to obtain s_0 u_1 s_1 \dots u_L s_L. The shuffle ideal is a more general class of regular languages where, instead of a single string u, we consider strings in a product set U = U_1 \times \dots \times U_L with U_i \subseteq \Sigma, and take the union of the prime shuffle ideals generated by the strings in U. The problem of learning regular languages has many applications.
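Membership in a shuffle ideal reduces to a greedy subsequence check; the sketch below (function name illustrative, not from the paper) tests whether a string belongs to the ideal generated by a product set U = U_1 \times \dots \times U_L:

```python
def in_shuffle_ideal(s: str, U: list[set[str]]) -> bool:
    """Return True iff s lies in the shuffle ideal of U_1 x ... x U_L,
    i.e. some u with u_j in U_j occurs in s as a subsequence.
    Greedily matching each U_j at the earliest opportunity is optimal
    by a standard exchange argument."""
    j = 0
    for c in s:
        if j < len(U) and c in U[j]:
            j += 1
    return j == len(U)
```

The learning problem is hard precisely because membership is easy but the generating sets U_1, \dots, U_L must be inferred from labeled examples.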


On the Computability of Multiclass PAC Learning

arXiv.org Machine Learning

We study the problem of computable multiclass learnability within the Probably Approximately Correct (PAC) learning framework of Valiant (1984). In the recently introduced computable PAC (CPAC) learning framework of Agarwal et al. (2020), both learners and the functions they output are required to be computable. We focus on the case of finite label space and start by proposing a computable version of the Natarajan dimension and showing that it characterizes CPAC learnability in this setting. We further generalize this result by establishing a meta-characterization of CPAC learnability for a certain family of dimensions: computable distinguishers. Distinguishers were defined by Ben-David et al. (1992) as a certain family of embeddings of the label space, with each embedding giving rise to a dimension. It was shown that the finiteness of each such dimension characterizes multiclass PAC learnability for finite label space in the non-computable setting. We show that the corresponding computable dimensions for distinguishers characterize CPAC learning. We conclude our analysis by proving that the DS dimension, which characterizes PAC learnability for infinite label space, cannot be expressed as a distinguisher (even in the case of finite label space).
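For finite hypothesis classes over finitely many points, Natarajan shattering (the combinatorial notion behind the Natarajan dimension) can be checked by brute force. The sketch below is an illustration of the definition, not anything from the paper:

```python
from itertools import product

def natarajan_shattered(points, hypotheses) -> bool:
    """Check whether `points` is Natarajan-shattered by the finite class
    `hypotheses` (each a function on points): there must exist f, g with
    f(x) != g(x) for all x such that every f/g-mixture is realized.
    Restricting f, g to realized patterns is WLOG, since the mixtures for
    T = S and T = {} must themselves be realized."""
    patterns = {tuple(h(x) for x in points) for h in hypotheses}
    d = len(points)
    for f in patterns:
        for g in patterns:
            if all(f[i] != g[i] for i in range(d)):
                if all(tuple(f[i] if b else g[i] for i, b in enumerate(bits))
                       in patterns
                       for bits in product([0, 1], repeat=d)):
                    return True
    return False
```

For binary labels this recovers VC shattering; the computable version studied in the paper additionally demands that the witnessing objects be computable.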


Review for NeurIPS paper: Experimental design for MRI by greedy policy search

Neural Information Processing Systems

Weaknesses: The main claim of the paper is the hypothesis that the noise in the non-greedy objective's gradients is the reason why the greedy method can outperform it. However, I think that the empirical methodology is not strong enough to back this claim up, as the experiments are carried out on a single dataset, using a single network architecture, and reported with a single performance metric. The hypothesis would be much more clearly substantiated if the noise in the gradients were shown to be a consistent trend across various settings; I am afraid that, in the current state, the conclusion could reflect anecdotal performance in the given setting. In addition, if I am correct, RL models are prone to unstable training and are generally hard to train well. How can you confidently ensure that this behaviour is not due to the RL policy not being trained for long enough?


Review for NeurIPS paper: Experimental design for MRI by greedy policy search

Neural Information Processing Systems

Three knowledgeable referees agree that the paper makes a valuable contribution to the MRI acceleration and reconstruction literature. They recognize the soundness of the proposed approach and its compelling experimental validation. I agree with the referees and recommend acceptance.


Efficient Optimal PAC Learning

arXiv.org Artificial Intelligence

Recent advances in the binary classification setting by Hanneke [2016b] and Larsen [2023] have resulted in optimal PAC learners. These learners leverage, respectively, a clever deterministic subsampling scheme and the classic bagging heuristic of Breiman [1996]. Both optimal PAC learners use, as a subroutine, the natural algorithm of empirical risk minimization. Consequently, the computational cost of these optimal PAC learners is tied to that of the empirical risk minimizer algorithm. In this work, we seek to provide an alternative perspective on the computational cost imposed by the link to the empirical risk minimizer algorithm. To this end, we show the existence of an optimal PAC learner which offers a different tradeoff in terms of the computational cost induced by the empirical risk minimizer.
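A minimal sketch of the bagging idea with a toy one-dimensional threshold ERM follows; the function names, the voting count, and the toy ERM are illustrative, and this is not the actual optimal learner:

```python
import random

def threshold_erm(sample):
    """Toy ERM over 1-D threshold classifiers h_t(x) = (x >= t)."""
    candidates = [x for x, _ in sample] + [float("-inf"), float("inf")]
    t = min(candidates, key=lambda t: sum((x >= t) != y for x, y in sample))
    return lambda x, t=t: x >= t

def bagged_learner(sample, erm, n_votes=5, seed=0):
    """Majority vote over ERM hypotheses fit to bootstrap resamples of
    the training set, in the spirit of the bagging-based learner."""
    rng = random.Random(seed)
    hyps = [erm([rng.choice(sample) for _ in sample]) for _ in range(n_votes)]
    def majority(x):
        votes = [h(x) for h in hyps]
        return max(set(votes), key=votes.count)
    return majority
```

Each vote only ever calls the ERM as a black box, which is exactly why the computational cost of such learners is inherited from the empirical risk minimizer.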


On characterizing optimal learning trajectories in a class of learning problems

arXiv.org Machine Learning

In this brief paper, we provide a mathematical framework that exploits the relationship between the maximum principle and dynamic programming for characterizing optimal learning trajectories in a class of learning problems related to point estimation for modeling high-dimensional nonlinear functions. Here, the characterization of the optimal learning trajectories is associated with the solution of an optimal control problem for a weakly-controlled gradient system with small parameters, whose time evolution is guided by a model training dataset and its perturbed version. The optimization problem consists of a cost functional that gauges the quality of the estimated model parameters at a fixed final time w.r.t. a model validation dataset. Moreover, using a successive Galerkin approximation method, we provide an algorithmic recipe for constructing the corresponding optimal learning trajectories leading to the optimal estimated model parameters for this class of learning problems.


Blackwell's Approachability with Approximation Algorithms

arXiv.org Artificial Intelligence

We revisit Blackwell's celebrated approachability problem, which considers a repeated vector-valued game between a player and an adversary. Motivated by settings in which the action set of the player or adversary (or both) is difficult to optimize over, for instance when it corresponds to the set of all possible solutions to some NP-Hard optimization problem, we ask what the player can guarantee \textit{efficiently} when only having access to these sets via approximation algorithms with ratios $\alpha_{\mX} \geq 1$ and $ 1 \geq \alpha_{\mY} > 0$, respectively. Assuming the player has monotone preferences, in the sense that he does not prefer a vector-valued loss $\ell_1$ over $\ell_2$ if $\ell_2 \leq \ell_1$, we establish that given a Blackwell instance with an approachable target set $S$, the downward closure of the appropriately-scaled set $\alpha_{\mX}\alpha_{\mY}^{-1}S$ is \textit{efficiently} approachable with optimal rate. In case only the player's or adversary's set is equipped with an approximation algorithm, we give simpler and more efficient algorithms.


Sample Complexity of Bias Detection with Subsampled Point-to-Subspace Distances

arXiv.org Artificial Intelligence

Sample complexity of bias estimation is a lower bound on the runtime of any bias detection method. Many regulatory frameworks require the bias to be tested for all subgroups, whose number grows exponentially with the number of protected attributes. Unless one wishes to run bias detection with a doubly-exponential runtime, one should like to have polynomial complexity of bias detection for a single subgroup. At the same time, the reference data may be based on surveys, and thus come with non-trivial uncertainty. Here, we reformulate bias detection as a point-to-subspace problem on the space of measures and show that, for the supremum norm, it can be subsampled efficiently. In particular, our probably approximately correct (PAC) results are corroborated by tests on well-known instances.
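As a toy sketch of subsampling a supremum-norm distance between discrete measures (a point-to-point simplification, purely illustrative; the paper's point-to-subspace estimator and its guarantees are more involved), checking k randomly chosen coordinates yields a lower bound that becomes exact once all coordinates are sampled:

```python
import random

def subsampled_sup_distance(p, q, k, seed=0):
    """Estimate max_i |p_i - q_i| from k coordinates sampled without
    replacement; always a lower bound on the true sup-norm distance."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(p)), k)
    return max(abs(p[i] - q[i]) for i in idx)
```

The point of a PAC analysis in this vein is to bound how large k must be so that, with high probability, the subsampled value is close to the true supremum.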


Networks with Finite VC Dimension: Pro and Contra

arXiv.org Machine Learning

Approximation and learning of classifiers of large data sets by neural networks in terms of high-dimensional geometry and statistical learning theory are investigated. The influence of the VC dimension of sets of input-output functions of networks on approximation capabilities is compared with its influence on consistency in learning from samples of data. It is shown that, whereas finite VC dimension is desirable for uniform convergence of empirical errors, it may not be desirable for approximation of functions drawn from a probability distribution modeling the likelihood that they occur in a given type of application. Based on the concentration-of-measure properties of high dimensional geometry, it is proven that both errors in approximation and empirical errors behave almost deterministically for networks implementing sets of input-output functions with finite VC dimensions in processing large data sets. Practical limitations of the universal approximation property, the trade-offs between the accuracy of approximation and consistency in learning from data, and the influence of depth of networks with ReLU units on their accuracy and consistency are discussed.