Computational Learning Theory
Monotonic Learning in the PAC Framework: A New Perspective
Li, Ming, Zhang, Chenyi, Li, Qin
Monotone learning refers to learning processes in which expected performance consistently improves as more training data is introduced. The non-monotone behavior of machine learning algorithms has been the topic of a series of recent works, with various proposals that ensure monotonicity by applying transformations or wrappers to learning algorithms. In this work, we approach monotone learning from a different perspective, within the framework of Probably Approximately Correct (PAC) learning theory. Following the mechanism used to estimate the sample complexity of a PAC-learnable problem, we derive a performance lower bound for that problem and prove that the bound is monotone as the sample size increases. By calculating the lower bound distribution, we prove that, given a PAC-learnable problem with a hypothesis space that is either of finite size or of finite VC dimension, any learning algorithm based on Empirical Risk Minimization (ERM) is monotone when training samples are independent and identically distributed (i.i.d.). We further carry out experiments on two concrete machine learning problems, one with a finite hypothesis set and the other with finite VC dimension, and compare the empirical risk distributions observed experimentally with the estimated theoretical bound. The comparison confirms the monotonicity of learning for both PAC-learnable problems.
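A minimal simulation of the monotonicity claim, under assumed specifics: a finite class of 1D threshold hypotheses, a noise-free target, and i.i.d. uniform samples. The grid, target threshold, and trial count below are illustrative choices, not taken from the paper; the paper's result predicts the printed expected-risk curve is non-increasing in the sample size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Finite hypothesis class: thresholds h_t(x) = 1[x >= t] on a grid.
THRESHOLDS = np.linspace(0.0, 1.0, 21)
TRUE_T = 0.37  # hypothetical target concept, for illustration only

def true_risk(t):
    # Under X ~ Uniform[0,1], the 0-1 risk of h_t against h_{TRUE_T}
    # equals the measure of the interval where they disagree.
    return abs(t - TRUE_T)

def erm(x, y):
    # Return the threshold with the smallest empirical 0-1 risk.
    errs = [np.mean((x >= t).astype(int) != y) for t in THRESHOLDS]
    return THRESHOLDS[int(np.argmin(errs))]

def expected_risk(n, trials=2000):
    # Monte Carlo estimate of the expected true risk of the ERM output.
    risks = []
    for _ in range(trials):
        x = rng.uniform(size=n)
        y = (x >= TRUE_T).astype(int)
        risks.append(true_risk(erm(x, y)))
    return float(np.mean(risks))

for n in [5, 10, 20, 40, 80, 160]:
    print(n, round(expected_risk(n), 4))
# The expected risk should decrease (weakly) as n grows, consistent
# with the monotonicity of ERM on i.i.d. samples.
```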
Reweighting Improves Conditional Risk Bounds
Zhang, Yikai, Lin, Jiahe, Li, Fengpei, Zheng, Songzhu, Raj, Anant, Schneider, Anderson, Nevmyvaka, Yuriy
In this work, we study the weighted empirical risk minimization (weighted ERM) scheme, in which an additional data-dependent weight function is incorporated when the empirical risk function is minimized. We show that under a general "balanceable" Bernstein condition, one can design a weighted ERM estimator that achieves superior performance in certain sub-regions over the one obtained from standard ERM, and the superiority manifests itself through a data-dependent constant term in the error bound. These sub-regions correspond to large-margin ones in classification settings and low-variance ones in heteroscedastic regression settings, respectively. Our findings are supported by evidence from synthetic data experiments.
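A minimal sketch of the weighted-ERM idea in the heteroscedastic regression setting the abstract mentions, assuming synthetic linear data with known noise variance and inverse-variance weights; this is a generic illustration, not the authors' estimator or weight function.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic heteroscedastic linear data: noise variance grows with |x|.
n = 500
x = rng.uniform(-2, 2, size=n)
sigma = 0.1 + 0.9 * np.abs(x)             # known here; estimated in practice
y = 1.5 * x + rng.normal(scale=sigma)

X = np.column_stack([np.ones(n), x])      # intercept + slope design

# Standard ERM: ordinary least squares.
beta_erm = np.linalg.lstsq(X, y, rcond=None)[0]

# Weighted ERM: minimize sum_i w_i * (y_i - x_i @ beta)^2 with a
# data-dependent weight function, here inverse variance.
w = 1.0 / sigma**2
Xw = X * w[:, None]                       # W X, with W = diag(w)
beta_werm = np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Estimation error against the true regression function 1.5 * x on the
# low-variance sub-region, where the paper predicts weighted ERM to win.
mask = np.abs(x) < 0.3
for name, b in [("ERM", beta_erm), ("weighted ERM", beta_werm)]:
    err = np.mean((1.5 * x[mask] - X[mask] @ b) ** 2)
    print(f"{name}: sub-region estimation error = {err:.5f}")
```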
Ensuring superior learning outcomes and data security for authorized learner
Bang, Jeongho, Song, Wooyeong, Shin, Kyujin, Kim, Yong-Su
The learner's ability to generate a hypothesis that closely approximates the target function is crucial in machine learning. Achieving this requires sufficient data; however, unauthorized access by an eavesdropping learner can lead to security risks. Thus, it is important to ensure the performance of the "authorized" learner by limiting the quality of the training data accessible to eavesdroppers. Unlike previous studies focusing on encryption or access controls, we provide a theorem that ensures superior learning outcomes exclusively for the authorized learner via quantum label encoding. In this context, we use the probably-approximately-correct (PAC) learning framework and introduce the concept of learning probability to quantitatively assess learner performance. Our theorem provides a condition under which, given a training dataset, the authorized learner is guaranteed to achieve a certain quality of learning outcome while eavesdroppers are not. Notably, this condition can be constructed from quantities of the training data that only the authorized learner can measure, namely its size and degree of noise. We validate our theoretical proofs and predictions through image classification experiments with convolutional neural networks (CNNs).
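As a point of reference for how a PAC guarantee can hinge on only the data size and noise degree, here is a sketch of a classical sufficient condition in the Angluin-Laird style for a finite hypothesis class under random classification noise. The paper's quantum-encoding condition is different; the bound and the numbers below are illustrative stand-ins.

```python
import math

def pac_guarantee(m, eta, h_size, epsilon, delta):
    """Classical Angluin-Laird style sufficient condition for PAC
    learning a finite hypothesis class under random classification
    noise of rate eta < 1/2:
        m >= 2 * ln(2 * |H| / delta) / (epsilon^2 * (1 - 2*eta)^2).
    A stand-in for the paper's quantum-encoding condition, which is
    likewise a function of only data size and noise degree."""
    if eta >= 0.5:
        return False               # no guarantee at noise rate >= 1/2
    needed = 2 * math.log(2 * h_size / delta) / (epsilon**2 * (1 - 2 * eta)**2)
    return m >= needed

# The authorized learner sees low-noise labels; an eavesdropper, high-noise.
print(pac_guarantee(m=50_000, eta=0.05, h_size=10**6, epsilon=0.05, delta=0.01))
print(pac_guarantee(m=50_000, eta=0.45, h_size=10**6, epsilon=0.05, delta=0.01))
```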
Generation through the lens of learning theory
Li, Jiaxun, Raman, Vinod, Tewari, Ambuj
Over the past 50 years, predictive machine learning has been a cornerstone for both theorists and practitioners. Predictive tasks like classification and regression have been extensively studied, in both theory and practice, due to their applications to face recognition, autonomous vehicles, fraud detection, recommendation systems, etc. Recently, however, a new paradigm of machine learning has emerged: generation. Unlike predictive models, which focus on making accurate predictions of the true label given examples, generative models aim to create new examples based on observed data. For example, in language modeling, the goal might be to generate coherent text in response to a prompt, while in drug development, one might want to generate candidate molecules. In fact, generative models have already been applied to these tasks and others [Zhao et al., 2023, Jumper et al., 2021]. The vast potential of generative machine learning has spurred a surge of research across diverse fields like natural language processing [Wolf et al., 2020], computer vision [Khan et al., 2022], and computational chemistry/biology [Vanhaelen et al., 2020]. Despite this widespread adoption, the theoretical foundations of generative machine learning lag far behind those of its predictive counterpart. While prediction has been extensively studied by learning theorists through frameworks like PAC and online learning [Shalev-Shwartz and Ben-David, 2014, Mohri et al., 2012, Cesa-Bianchi and Lugosi, 2006], generative machine learning has, for the most part, remained unexplored from a learning-theoretic standpoint.
Optimal bounds for dissatisfaction in perpetual voting
Kozachinskiy, Alexander, Shen, Alexander, Steifer, Tomasz
In perpetual voting, multiple decisions are made at different moments in time. Taking the history of previous decisions into account allows us to satisfy properties such as proportionality over periods of time. In this paper, we consider the following question: is there a perpetual approval voting method that guarantees that no voter is dissatisfied too many times? We identify a sufficient condition on voter behavior -- which we call the 'bounded conflicts' condition -- under which a sublinear growth of dissatisfaction is possible. We provide a tight upper bound on the growth of dissatisfaction under bounded conflicts, using techniques from Kolmogorov complexity. We also observe that approval voting with binary choices mimics the machine learning setting of prediction with expert advice. This allows us to present a voting method with sublinear guarantees on dissatisfaction under bounded conflicts, based on standard techniques from prediction with expert advice.
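To make the expert-advice connection concrete, here is a sketch of an exponential-weights voting rule for binary choices: voters play the role of experts, dissatisfaction plays the role of cumulative loss, and frequently dissatisfied voters are upweighted so later decisions favor them. This is a generic illustration of the machinery, not the paper's method or its bound.

```python
import numpy as np

rng = np.random.default_rng(2)

def perpetual_vote(approvals, eta=0.1):
    """Exponential-weights perpetual approval voting over binary choices.

    approvals[t, i] = 1 if voter i approves option 1 in round t, else 0
    (voter i then approves option 0). Each voter carries weight
    exp(eta * dissatisfaction so far), mirroring prediction with
    expert advice; a hedged sketch, not the paper's exact method."""
    T, n = approvals.shape
    dissatisfaction = np.zeros(n)
    decisions = np.empty(T, dtype=int)
    for t in range(T):
        w = np.exp(eta * dissatisfaction)
        vote1 = w @ approvals[t]           # weighted support for option 1
        decisions[t] = int(vote1 >= w.sum() / 2)
        dissatisfaction += (approvals[t] != decisions[t])
    return decisions, dissatisfaction

approvals = rng.integers(0, 2, size=(200, 9))
_, d = perpetual_vote(approvals)
print("max dissatisfaction after 200 rounds:", int(d.max()))
```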
Differentially Private Release and Learning of Threshold Functions
Bun, Mark, Nissim, Kobbi, Stemmer, Uri, Vadhan, Salil
We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We give the first nontrivial lower bound for releasing thresholds with $(\epsilon,\delta)$ differential privacy, showing that the task is impossible over an infinite domain $X$, and moreover requires sample complexity $n \ge \Omega(\log^*|X|)$, which grows with the size of the domain. Inspired by the techniques used to prove this lower bound, we give an algorithm for releasing thresholds with $n \le 2^{(1+ o(1))\log^*|X|}$ samples. This improves the previous best upper bound of $8^{(1 + o(1))\log^*|X|}$ (Beimel et al., RANDOM '13). Our sample complexity upper and lower bounds also apply to the tasks of learning distributions with respect to Kolmogorov distance and of properly PAC learning thresholds with differential privacy. The lower bound gives the first separation between the sample complexity of properly learning a concept class with $(\epsilon,\delta)$ differential privacy and learning without privacy. For properly learning thresholds in $\ell$ dimensions, this lower bound extends to $n \ge \Omega(\ell \cdot \log^*|X|)$. To obtain our results, we give reductions in both directions between releasing and properly learning thresholds and the simpler interior point problem. Given a database $D$ of elements from $X$, the interior point problem asks for an element between the smallest and largest elements in $D$. We introduce new recursive constructions for bounding the sample complexity of the interior point problem, as well as further reductions and techniques for proving impossibility results for other basic problems in differential privacy.
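For intuition about the interior point problem, here is a simple private baseline via the exponential mechanism: the quality of a candidate $x$ is $\min(\#\{d \le x\}, \#\{d \ge x\})$, which has sensitivity 1 and is large exactly between the minimum and maximum of the database. This basic approach needs roughly $O(\log|X|)$ samples, far above the paper's $2^{O(\log^*|X|)}$ recursive constructions; it only illustrates the problem, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)

def dp_interior_point(data, domain, epsilon):
    """Exponential-mechanism baseline for the interior point problem.

    Quality q(D, x) = min(#{d <= x}, #{d >= x}) has sensitivity 1, so
    sampling x with probability proportional to exp(eps * q / 2) is
    epsilon-differentially private (standard exponential mechanism)."""
    data = np.asarray(data)
    quality = np.array([min((data <= x).sum(), (data >= x).sum())
                        for x in domain], dtype=float)
    logits = epsilon * quality / 2.0
    probs = np.exp(logits - logits.max())  # stabilized softmax
    probs /= probs.sum()
    return domain[rng.choice(len(domain), p=probs)]

domain = np.arange(1024)
data = rng.integers(300, 700, size=200)
print(dp_interior_point(data, domain, epsilon=1.0))
```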
On Calibration in Multi-Distribution Learning
Verma, Rajeev, Fischer, Volker, Nalisnick, Eric
Modern challenges of robustness, fairness, and decision-making in machine learning have led to the formulation of multi-distribution learning (MDL) frameworks, in which a predictor is optimized across multiple distributions. We study the calibration properties of MDL to better understand how the predictor performs uniformly across the multiple distributions. Through classical results on decomposing proper scoring losses, we first derive the Bayes optimal rule for MDL, demonstrating that it maximizes the generalized entropy of the associated loss function. Our analysis reveals that while this approach ensures minimal worst-case loss, it can lead to non-uniform calibration errors across the multiple distributions, and that there is an inherent calibration-refinement trade-off, even at Bayes optimality. Our results highlight a critical limitation: despite the promise of MDL, one must use caution when designing predictors tailored to multiple distributions so as to minimize disparity.
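A toy sketch of the trade-off: two distributions with different conditional label probabilities, a single probabilistic predictor scored by log loss, and a grid-searched minimax report. The numbers are illustrative, not from the paper; the point is that the worst-case-optimal report is miscalibrated by different amounts under the two distributions.

```python
import numpy as np

# Two distributions that differ in P(y=1 | x) at the same point: a toy
# multi-distribution learning setup with illustrative values.
p = np.array([0.8, 0.6])          # true P(y=1) under each distribution

def expected_log_loss(q, p_k):
    # E_{y ~ Bernoulli(p_k)}[-log q(y)] for a predictor reporting q = P(y=1).
    return -(p_k * np.log(q) + (1 - p_k) * np.log(1 - q))

# Minimax predictor: minimize the worst-case loss over distributions,
# as the MDL objective does (grid search suffices for the sketch).
grid = np.linspace(0.01, 0.99, 9801)
worst = np.max([expected_log_loss(grid, pk) for pk in p], axis=0)
q_star = grid[np.argmin(worst)]

print("minimax predictor q* =", round(q_star, 3))
for k, pk in enumerate(p):
    print(f"distribution {k}: calibration error |q* - p| = {abs(q_star - pk):.3f}")
# q* sits between the two conditionals, so the calibration error is
# nonzero and unequal across distributions even at the minimax optimum.
```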
Representative Social Choice: From Learning Theory to AI Alignment
Social choice theory is the study of preference aggregation across a population, used both in mechanism design for human agents and in the democratic alignment of language models. In this study, we propose the representative social choice framework for modeling democratic representation in collective decisions where the number of issues and individuals is too large for mechanisms to consider all preferences directly. These scenarios are widespread in real-world decision-making processes, such as jury trials, indirect elections, legislation processes, corporate governance, and, more recently, language model alignment. In representative social choice, the population is represented by a finite sample of individual-issue pairs based on which social choice decisions are made. We show that many of the deepest questions in representative social choice can be naturally formulated as statistical learning problems, and we prove generalization properties of social choice mechanisms using the theory of machine learning. We further formulate axioms for representative social choice and prove Arrow-like impossibility theorems with new combinatorial tools of analysis. Our framework introduces the representative approach to social choice, opening up research directions at the intersection of social choice, learning theory, and AI alignment.
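A small sketch of the statistical-learning framing: estimate a population-level quantity (here, the approval rate of a fixed collective decision) from sampled individual-issue pairs and attach a generic Hoeffding-style generalization radius. The population model and numbers are hypothetical; the paper's generalization results are for social choice mechanisms, not this single-rate estimate.

```python
import math
import numpy as np

rng = np.random.default_rng(4)

def generalization_radius(m, delta):
    """Hoeffding-style radius: with probability >= 1 - delta, the empirical
    approval rate over m i.i.d. sampled individual-issue pairs is within
    this radius of its population value. A generic bound used only to
    mimic the paper's statistical-learning framing."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * m))

# Hypothetical population with a latent approval probability for some
# fixed collective decision; sampling replaces full elicitation.
true_approval = 0.62
m = 5000                                   # sampled individual-issue pairs
sample = rng.random(m) < true_approval     # 1 = sampled pair approves
emp = sample.mean()
rad = generalization_radius(m, delta=0.05)
print(f"empirical approval {emp:.3f} +/- {rad:.3f} (true {true_approval})")
```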
Control of Overfitting with Physics
Kozyrev, Sergei V., Lopatin, Ilya A, Pechen, Alexander N
Analogies from physics and other fields, particularly population genetics, are of interest when studying problems in machine learning theory. Analogies between machine learning theory and Darwinian evolution were already discussed by Alan Turing [1]. Biological analogies in computing were discussed by John von Neumann [2]. Physical models in relation to computing were discussed by Yuri Manin [3]. Such analogies allow physical intuition to be used in learning theory. Well-known examples include genetic [4] and evolutionary algorithms [5], models of neural networks and physical systems with emergent collective computational abilities and content-addressable memory [6], and a parallel search learning method based on statistical mechanics and Boltzmann machines that mimic Ising spin chains [7]. A phenomenological model of population genetics, the Lotka-Volterra model with mutations, related to generative adversarial networks (GANs), was introduced in [8]. Analogies between the evolution operator in physics and transformers (an artificial intelligence model) were discussed in [9]. Ideas from thermodynamics applied to learning were considered in [10,11], and in relation to evolution theory in [12,13].
The Cost of Replicability in Active Learning
Hira, Rupkatha, Kau, Dominik, Sorrell, Jessica
Active learning aims to reduce the number of labeled examples required by machine learning algorithms by selectively querying the labels of initially unlabeled data points. Ensuring the replicability of results, where an algorithm consistently produces the same outcome across different runs, is essential for the reliability of machine learning models but often increases sample complexity. This report investigates the cost of replicability in active learning using the CAL algorithm, a classical disagreement-based active learning method. By integrating replicable statistical query subroutines and random thresholding techniques, we propose two versions of a replicable CAL algorithm. Our theoretical analysis demonstrates that while replicability does increase label complexity, the CAL algorithm can still achieve significant savings in label complexity even under the replicability constraint. These findings offer valuable insights into balancing efficiency and robustness in machine learning models.
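For readers unfamiliar with CAL, here is a minimal sketch of the plain (non-replicable) algorithm for 1D thresholds: maintain the version space as an interval and query a label only when the point falls in the disagreement region. This is the baseline the paper starts from, not its replicable variant with statistical-query subroutines; the threshold and stream are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def cal_thresholds(stream, oracle):
    """CAL (disagreement-based active learning) for 1D thresholds
    h_t(x) = 1[x >= t]. The version space is an interval of candidate
    thresholds; labels are queried only in the disagreement region,
    where surviving hypotheses predict differently."""
    lo, hi = 0.0, 1.0        # current version space for the threshold t
    labels_used = 0
    for x in stream:
        if lo < x <= hi:     # hypotheses disagree here: query the label
            labels_used += 1
            if oracle(x):    # label 1 (x >= t) -> threshold is at most x
                hi = x
            else:            # label 0 (x < t)  -> threshold is above x
                lo = x
        # Points outside (lo, hi] are labeled by consensus; no query.
    return (lo + hi) / 2, labels_used

true_t = 0.42
stream = rng.uniform(size=1000)
t_hat, used = cal_thresholds(stream, oracle=lambda x: x >= true_t)
print(f"estimate {t_hat:.4f}, labels queried: {used} / 1000")
# The disagreement region shrinks quickly, so the number of queried
# labels is far below the stream length: the label-complexity savings
# that the paper shows survive the replicability constraint.
```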