Collaborating Authors

 Urner, Ruth


Simplifying Adversarially Robust PAC Learning with Tolerance

arXiv.org Artificial Intelligence

Adversarially robust PAC learning has proved to be challenging, with the currently best known learners [Montasser et al., 2021a] relying on improper methods based on intricate compression schemes, resulting in sample complexity exponential in the VC-dimension. A series of follow-up works considered a slightly relaxed version of the problem called adversarially robust learning with tolerance [Ashtiani et al., 2023, Bhattacharjee et al., 2023, Raman et al., 2024] and achieved better sample complexity in terms of the VC-dimension. However, those algorithms were either improper and complex, or required additional assumptions on the hypothesis class H. We prove, for the first time, the existence of a simpler learner that achieves a sample complexity linear in the VC-dimension without requiring additional assumptions on H. Even though our learner is improper, it is "almost proper" in the sense that it outputs a hypothesis that is "similar" to a hypothesis in H. We also use the ideas from our algorithm to construct a semi-supervised learner in the tolerant setting. This simple algorithm achieves bounds comparable to those of the previous (non-tolerant) semi-supervised algorithm of Attias et al. [2022a], but avoids the use of intricate subroutines from previous works, and is "almost proper."


On the Computability of Multiclass PAC Learning

arXiv.org Machine Learning

We study the problem of computable multiclass learnability within the Probably Approximately Correct (PAC) learning framework of Valiant (1984). In the recently introduced computable PAC (CPAC) learning framework of Agarwal et al. (2020), both learners and the functions they output are required to be computable. We focus on the case of finite label space and start by proposing a computable version of the Natarajan dimension and showing that it characterizes CPAC learnability in this setting. We further generalize this result by establishing a meta-characterization of CPAC learnability for a certain family of dimensions: computable distinguishers. Distinguishers were defined by Ben-David et al. (1992) as a certain family of embeddings of the label space, with each embedding giving rise to a dimension. It was shown that the finiteness of each such dimension characterizes multiclass PAC learnability for finite label space in the non-computable setting. We show that the corresponding computable dimensions for distinguishers characterize CPAC learning. We conclude our analysis by proving that the DS dimension, which characterizes PAC learnability for infinite label space, cannot be expressed as a distinguisher (even in the case of finite label space).


Calibration through the Lens of Interpretability

arXiv.org Artificial Intelligence

In many applications it is important that a classification model not only has high accuracy but that a user is also provided with a reliable estimate of confidence in the predicted label. Calibration is a concept that is often invoked to provide such confidence estimates to a user. As such, calibration is a notion that is inherently aimed at human interpretation. In binary classification, a perfectly calibrated model f provides the guarantee that if it predicts f(x) = p on some instance x, then among the set of all instances on which f assigns this value p, the probability of label 1 is indeed p (and the probability of label 0 thus 1 - p). While calibration is generally considered useful, we would argue that in many cases, even if achieved, it is doomed to fail at its original goal of providing insight to a human user: for most suitably complex classification models, a human user that observes f(x) = p has no notion of the set of all instances on which f also outputs p.
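
A minimal, generic sketch (not taken from the paper) of how calibration is commonly assessed empirically may help make this concrete: test instances are grouped into score bins, and the average predicted score in each bin is compared with the observed frequency of label 1. The bin count and the synthetic data below are purely illustrative.

import numpy as np

def binned_calibration(scores, labels, n_bins=10):
    """Return (mean predicted score, empirical label-1 frequency, count) per non-empty bin."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # half-open bins, except the last one, which includes the right endpoint
        mask = (scores >= lo) & ((scores <= hi) if hi == edges[-1] else (scores < hi))
        if mask.any():
            rows.append((scores[mask].mean(), labels[mask].mean(), int(mask.sum())))
    return rows

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = rng.uniform(size=10_000)   # hypothetical predicted probabilities f(x)
    y = rng.binomial(1, p)         # labels drawn so that f is perfectly calibrated by construction
    for mean_p, freq, n in binned_calibration(p, y):
        print(f"predicted {mean_p:.2f}   observed {freq:.2f}   (n={n})")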


On the Computability of Robust PAC Learning

arXiv.org Artificial Intelligence

We initiate the study of computability requirements for adversarially robust learning. Adversarially robust PAC-type learnability is by now an established field of research. However, the effects of computability requirements in PAC-type frameworks are only just starting to emerge. We introduce the problem of robust computable PAC (robust CPAC) learning and provide some simple sufficient conditions for this. We then show that learnability in this setup is not implied by the combination of its components: classes that are both CPAC and robustly PAC learnable are not necessarily robustly CPAC learnable. Furthermore, we show that the novel framework exhibits some surprising effects: for robust CPAC learnability it is not required that the robust loss is computably evaluable! Towards understanding characterizing properties, we introduce a novel dimension, the computable robust shattering dimension. We prove that its finiteness is necessary, but not sufficient for robust CPAC learnability. This might yield novel insights for the corresponding phenomenon in the context of robust PAC learnability, where insufficiency of the robust shattering dimension for learnability has been conjectured, but so far a resolution has remained elusive.


Adversarially Robust Learning with Tolerance

arXiv.org Artificial Intelligence

We initiate the study of tolerant adversarial PAC-learning with respect to metric perturbation sets. In adversarial PAC-learning, an adversary is allowed to replace a test point $x$ with an arbitrary point in a closed ball of radius $r$ centered at $x$. In the tolerant version, the error of the learner is compared with the best achievable error with respect to a slightly larger perturbation radius $(1+\gamma)r$. This simple tweak helps us bridge the gap between theory and practice and obtain the first PAC-type guarantees for algorithmic techniques that are popular in practice. Our first result concerns the widely-used ``perturb-and-smooth'' approach for adversarial learning. For perturbation sets with doubling dimension $d$, we show that a variant of these approaches PAC-learns any hypothesis class $\mathcal{H}$ with VC-dimension $v$ in the $\gamma$-tolerant adversarial setting with $O\left(\frac{v(1+1/\gamma)^{O(d)}}{\varepsilon}\right)$ samples. This is in contrast to the traditional (non-tolerant) setting in which, as we show, the perturb-and-smooth approach can provably fail. Our second result shows that one can PAC-learn the same class using $\widetilde{O}\left(\frac{d \cdot v\log(1+1/\gamma)}{\varepsilon^2}\right)$ samples even in the agnostic setting. This result is based on a novel compression-based algorithm, and achieves a linear dependence on the doubling dimension as well as the VC-dimension. This is in contrast to the non-tolerant setting, where there is no known sample complexity upper bound that depends polynomially on the VC-dimension.
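
In symbols, the tolerant guarantee sketched above reads as follows (the notation is ours, reconstructed from the description in the abstract; $\bar{B}_{\rho}(x)$ denotes the closed ball of radius $\rho$ around $x$):

$$\mathrm{err}_{\rho}(h) \;=\; \Pr_{(x,y)\sim\mathcal{D}}\Big[\exists\, x' \in \bar{B}_{\rho}(x):\ h(x') \neq y\Big], \qquad \mathrm{err}_{r}\big(\hat{h}\big) \;\le\; \min_{h\in\mathcal{H}} \mathrm{err}_{(1+\gamma)r}(h) \;+\; \varepsilon,$$

with the guarantee holding with high probability over the draw of the training sample. The non-tolerant setting corresponds to $\gamma = 0$, where both sides of the comparison use the same radius $r$.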


On the (Un-)Avoidability of Adversarial Examples

arXiv.org Machine Learning

The phenomenon of adversarial examples in deep learning models has caused substantial concern over their reliability. While many deep neural networks achieve impressive predictive accuracy, in many instances an imperceptible perturbation can falsely flip the network's prediction. Most research has since focused on developing defenses against adversarial attacks or on learning under a worst-case adversarial loss. In this work, we take a step back and aim to provide a framework for determining whether a model's label change under a small perturbation is justified (and when it is not). We carefully argue that adversarial robustness should be defined as a locally adaptive measure complying with the underlying distribution. We then suggest a definition of an adaptive robust loss, derive an empirical version of it, and develop a resulting data-augmentation framework. We prove that our adaptive data augmentation maintains consistency of 1-nearest-neighbor classification under deterministic labels, and we provide illustrative empirical evaluations.


Black-box Certification and Learning under Adversarial Perturbations

arXiv.org Machine Learning

We formally study the problem of classification under adversarial perturbations, both from the learner's perspective, and from the viewpoint of a third-party who aims at certifying the robustness of a given black-box classifier. We analyze a PAC-type framework of semi-supervised learning and identify possibility and impossibility results for proper learning of VC-classes in this setting. We further introduce and study a new setting of black-box certification under limited query budget. We analyze this for various classes of predictors and types of perturbation. We also consider the viewpoint of a black-box adversary that aims at finding adversarial examples, showing that the existence of an adversary with polynomial query complexity implies the existence of a robust learner with small sample complexity.
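
As a toy illustration of the black-box access model only (not of the certification algorithms analyzed in the paper), a certifier restricted to queries might probe the classifier at randomly sampled points in a ball around a test instance and report a flip if it finds one; the query budget, the sampling scheme, and the linear threshold classifier below are all hypothetical.

import numpy as np

def query_certify(f, x, radius, budget=500, rng=None):
    """Return False if a label flip is found within `radius` of x; True only means
    that no flip was found within the query budget (no formal guarantee)."""
    if rng is None:
        rng = np.random.default_rng()
    base = f(x)
    for _ in range(budget):
        direction = rng.normal(size=x.shape)
        direction /= np.linalg.norm(direction)
        x_prime = x + rng.uniform(0.0, radius) * direction   # random point in the l2-ball
        if f(x_prime) != base:
            return False   # an adversarial example was found
    return True

if __name__ == "__main__":
    # Hypothetical black box: a linear threshold classifier on two features.
    f = lambda z: int(z.sum() > 0)
    x = np.array([0.3, 0.4])
    print(query_certify(f, x, radius=0.2, rng=np.random.default_rng(1)))   # True: no flip reachable
    print(query_certify(f, x, radius=1.0, rng=np.random.default_rng(1)))   # very likely False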


When can unlabeled data improve the learning rate?

arXiv.org Machine Learning

In semi-supervised classification, one is given access both to labeled and unlabeled data. As unlabeled data is typically cheaper to acquire than labeled data, this setup becomes advantageous as soon as one can exploit the unlabeled data in order to produce a better classifier than with labeled data alone. However, the conditions under which such an improvement is possible are not fully understood yet. Our analysis focuses on improvements in the minimax learning rate in terms of the number of labeled examples (with the number of unlabeled examples being allowed to depend on the number of labeled ones). We argue that for such improvements to be realistic and indisputable, certain specific conditions should be satisfied and previous analyses have failed to meet those conditions. We then demonstrate examples where these conditions can be met, in particular showing rate changes from $1/\sqrt{\ell}$ to $e^{-c\ell}$ and from $1/\sqrt{\ell}$ to $1/\ell$. These results improve our understanding of what is and isn't possible in semi-supervised learning.


Lifelong Learning with Weighted Majority Votes

Neural Information Processing Systems

Better understanding of the potential benefits of information transfer and representation learning is an important step towards the goal of building intelligent systems that are able to persist in the world and learn over time. In this work, we consider a setting where the learner encounters a stream of tasks but is able to retain only limited information from each encountered task, such as a learned predictor. In contrast to most previous works analyzing this scenario, we do not make any distributional assumptions on the task generating process. Instead, we formulate a complexity measure that captures the diversity of the observed tasks. We provide a lifelong learning algorithm with error guarantees for every observed task (rather than on average). We show sample complexity reductions in comparison to solving every task in isolation in terms of our task complexity measure. Further, our algorithmic framework can naturally be viewed as learning a representation from encountered tasks with a neural network.
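
For intuition on the "weighted majority vote" in the title, here is a generic (not paper-specific) sketch of combining binary predictors retained from earlier tasks by a weighted vote; the predictors and weights below are hypothetical placeholders.

def weighted_majority_vote(predictors, weights, x):
    """Combine binary predictors (labels in {-1, +1}) retained from earlier tasks
    by a weighted vote; ties break towards +1."""
    votes = sum(w * h(x) for h, w in zip(predictors, weights))
    return 1 if votes >= 0 else -1

if __name__ == "__main__":
    # Hypothetical predictors retained from three earlier tasks, with hand-picked weights.
    predictors = [lambda x: 1 if x[0] > 0 else -1,
                  lambda x: 1 if x[1] > 0 else -1,
                  lambda x: 1 if x[0] + x[1] > 1 else -1]
    weights = [0.6, 0.25, 0.15]
    print(weighted_majority_vote(predictors, weights, (0.6, -0.2)))   # -> 1
    print(weighted_majority_vote(predictors, weights, (-0.6, -0.2)))  # -> -1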


Active Nearest-Neighbor Learning in Metric Spaces

Neural Information Processing Systems

We propose a pool-based non-parametric active learning algorithm for general metric spaces, called MArgin Regularized Metric Active Nearest Neighbor (MARMANN), which outputs a nearest-neighbor classifier. We give prediction error guarantees that depend on the noisy-margin properties of the input sample, and are competitive with those obtained by previously proposed passive learners. We prove that the label complexity of MARMANN is significantly lower than that of any passive learner with similar error guarantees. Our algorithm is based on a generalized sample compression scheme and a new label-efficient active model-selection procedure.
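
For intuition only, the following is a minimal sketch of the form of predictor MARMANN outputs: a nearest-neighbor rule over a small compression set of labeled points. How that set is selected (the margin-regularized, label-efficient model-selection procedure) is the paper's actual contribution and is not reproduced here; the metric and the compression set below are chosen arbitrarily.

import numpy as np

def nn_predict(compression_set, query):
    """1-nearest-neighbor prediction over a (small) compression set of (point, label) pairs."""
    points = np.array([p for p, _ in compression_set])
    labels = [y for _, y in compression_set]
    distances = np.linalg.norm(points - query, axis=1)   # Euclidean metric, for illustration
    return labels[int(np.argmin(distances))]

if __name__ == "__main__":
    # Hypothetical compression set: two prototypes per class, chosen by hand.
    compression_set = [(np.array([0.0, 0.0]), 0), (np.array([0.2, 0.1]), 0),
                       (np.array([1.0, 1.0]), 1), (np.array([0.9, 0.8]), 1)]
    print(nn_predict(compression_set, np.array([0.15, 0.05])))   # -> 0
    print(nn_predict(compression_set, np.array([0.95, 0.90])))   # -> 1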