AITopics

2605.30324

Country:

North America > United States (0.28)
Europe (0.27)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.81)

arXiv.org Machine LearningApr-24-2026

The Sample Complexity of Multicalibration

Collina, Natalie, Lu, Jiuyao, Noarov, Georgy, Roth, Aaron

We study the minimax sample complexity of multicalibration in the batch setting. A learner observes $n$ i.i.d. samples from an unknown distribution and must output a (possibly randomized) predictor whose population multicalibration error, measured by Expected Calibration Error (ECE), is at most $\varepsilon$ with respect to a given family of groups. For every fixed $κ> 0$, in the regime $|G|\le \varepsilon^{-κ}$, we prove that $\widetildeΘ(\varepsilon^{-3})$ samples are necessary and sufficient, up to polylogarithmic factors. The lower bound holds even for randomized predictors, and the upper bound is realized by a randomized predictor obtained via an online-to-batch reduction. This separates the sample complexity of multicalibration from that of marginal calibration, which scales as $\widetildeΘ(\varepsilon^{-2})$, and shows that mean-ECE multicalibration is as difficult in the batch setting as it is in the online setting, in contrast to marginal calibration which is strictly more difficult in the online setting. In contrast we observe that for $κ= 0$, the sample complexity of multicalibration remains $\widetildeΘ(\varepsilon^{-2})$ exhibiting a sharp threshold phenomenon. More generally, we establish matching upper and lower bounds, up to polylogarithmic factors, for a weighted $L_p$ multicalibration metric for all $1 \le p \le 2$, with optimal exponent $3/p$. We also extend the lower-bound template to a regular class of elicitable properties, and combine it with the online upper bounds of Hu et al. (2025) to obtain matching bounds for calibrating properties including expectiles and bounded-density quantiles.

artificial intelligence, machine learning, multicalibration, (16 more...)

2604.21923

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)

Mayeux, Arnaud, Zhang, Jujian

The mechanization of science illustrated by the Lean formalization of the multi-graded Proj construction

arXiv.org Artificial IntelligenceSep-19-2025

Arnaud Mayeux and Jujian Zhang Efforts to mechanize aspects of scientific reasoning have been intertwined with the development of science from its earliest days. C1 "Whenever we have a long, difficult piece of algebra, and we have them more and more often these days, we could at least get the machine to check that the algebra was right before we went on and built further stages of derivation on top. Some people are working on such programs for algebra checking right now." C2 "Now I leave the region of known processes and enter the land of speculation. We can, I believe, reasonably expect that an algebra checking routine would not be around very long before someone would adapt the methods of heuristics that are presently being developed to the problem of doing algebra in a more creative way. The machine could supply several steps at a time, and be given only a guiding thread of a proof. The more successful the heuristics, the fewer steps we would have to supply."

artificial intelligence, logic & formal reasoning, submonoid, (20 more...)

2509.15116

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.48)

Areces, Felipe, Duchi, John C.

Distribution free M-estimation

arXiv.org Artificial IntelligenceAug-27-2025

The basic question of delineating those statistical problems that are solvable without making any assumptions on the underlying data distribution has long animated statistics and learning theory. This paper characterizes when a convex M-estimation or stochastic optimization problem is solvable in such an assumption-free setting, providing a precise dividing line between solvable and unsolvable problems. The conditions we identify show, perhaps surprisingly, that Lipschitz continuity of the loss being minimized is not necessary for distribution free minimization, and they are also distinct from classical characterizations of learnability in machine learning.

artificial intelligence, lemma 3, machine learning, (16 more...)

2505.22807

Country: North America > United States (0.28)

Genre: Research Report (0.63)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Chornomaz, Bogdan, Moran, Shay, Waknine, Tom

Spherical dimension

arXiv.org Artificial IntelligenceMar-13-2025

We introduce and study the spherical dimension, a natural topological relaxation of the VC dimension that unifies several results in learning theory where topology plays a key role in the proofs. The spherical dimension is defined by extending the set of realizable datasets (used to define the VC dimension) to the continuous space of realizable distributions. In this space, a shattered set of size d (in the VC sense) is completed into a continuous object, specifically a d-dimensional sphere of realizable distributions. The spherical dimension is then defined as the dimension of the largest sphere in this space. Thus, the spherical dimension is at least the VC dimension. The spherical dimension serves as a common foundation for leveraging the Borsuk-Ulam theorem and related topological tools. We demonstrate the utility of the spherical dimension in diverse applications, including disambiguations of partial concept classes, reductions from classification to stochastic convex optimization, stability and replicability, and sample compression schemes. Perhaps surprisingly, we show that the open question posed by Alon, Hanneke, Holzman, and Moran (FOCS 2021) of whether there exist non-trivial disambiguations for halfspaces with margin is equivalent to the basic open question of whether the VC and spherical dimensions are finite together.

dimension, disambiguation, spherical dimension, (15 more...)

2503.1024

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
(4 more...)

Genre: Research Report (0.63)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Gerstenberg, Julian, Neininger, Ralph, Spiegel, Denis

On Policy Evaluation Algorithms in Distributional Reinforcement Learning

arXiv.org Machine LearningJul-19-2024

We introduce a novel class of algorithms to efficiently approximate the unknown return distributions in policy evaluation problems from distributional reinforcement learning (DRL). The proposed distributional dynamic programming algorithms are suitable for underlying Markov decision processes (MDPs) having an arbitrary probabilistic reward mechanism, including continuous reward distributions with unbounded support being potentially heavy-tailed. For a plain instance of our proposed class of algorithms we prove error bounds, both within Wasserstein and Kolmogorov--Smirnov distances. Furthermore, for return distributions having probability density functions the algorithms yield approximations for these densities; error bounds are given within supremum norm. We introduce the concept of quantile-spline discretizations to come up with algorithms showing promising results in simulation experiments. While the performance of our algorithms can rigorously be analysed they can be seen as universal black box algorithms applicable to a large class of MDPs. We also derive new properties of probability metrics commonly used in DRL on which our quantitative analysis is based.

algorithm, approximation, parameter algorithm, (12 more...)

2407.14175

Country:

Europe > Germany > Hesse > Darmstadt Region > Frankfurt (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

de Lara, Lucas, Deronzier, Mathis, González-Sanz, Alberto, Foy, Virgile

On the nonconvexity of some push-forward constraints and its consequences in machine learning

arXiv.org Machine LearningMar-12-2024

The push-forward operation enables one to redistribute a probability measure through a deterministic map. It plays a key role in statistics and optimization: many learning problems (notably from optimal transport, generative modeling, and algorithmic fairness) include constraints or penalties framed as push-forward conditions on the model. However, the literature lacks general theoretical insights on the (non)convexity of such constraints and its consequences on the associated learning problems. This paper aims at filling this gap. In a first part, we provide a range of sufficient and necessary conditions for the (non)convexity of two sets of functions: the maps transporting one probability measure to another; the maps inducing equal output distributions across distinct probability measures. This highlights that for most probability measures, these push-forward constraints are not convex. In a second time, we show how this result implies critical limitations on the design of convex optimization problems for learning generative models or group-fair predictors. This work will hopefully help researchers and practitioners have a better understanding of the critical impact of push-forward conditions onto convexity.

convex, convexity, probability measure, (15 more...)

2403.07471

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Ganguly, Soumya, Tran, Khoa, Sarkar, Rahul

Uniform $\mathcal{C}^k$ Approximation of $G$-Invariant and Antisymmetric Functions, Embedding Dimensions, and Polynomial Representations

arXiv.org Artificial IntelligenceMar-2-2024

For any subgroup $G$ of the symmetric group $\mathcal{S}_n$ on $n$ symbols, we present results for the uniform $\mathcal{C}^k$ approximation of $G$-invariant functions by $G$-invariant polynomials. For the case of totally symmetric functions ($G = \mathcal{S}_n$), we show that this gives rise to the sum-decomposition Deep Sets ansatz of Zaheer et al. (2018), where both the inner and outer functions can be chosen to be smooth, and moreover, the inner function can be chosen to be independent of the target function being approximated. In particular, we show that the embedding dimension required is independent of the regularity of the target function, the accuracy of the desired approximation, as well as $k$. Next, we show that a similar procedure allows us to obtain a uniform $\mathcal{C}^k$ approximation of antisymmetric functions as a sum of $K$ terms, where each term is a product of a smooth totally symmetric function and a smooth antisymmetric homogeneous polynomial of degree at most $\binom{n}{2}$. We also provide upper and lower bounds on $K$ and show that $K$ is independent of the regularity of the target function, the desired approximation accuracy, and $k$.

approximation, polynomial, representation, (15 more...)

2403.01339

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Data Science > Data Mining (0.67)

Caprio, Michele, Sultana, Maryam, Elia, Eleni, Cuzzolin, Fabio

Credal Learning Theory

arXiv.org Artificial IntelligenceFeb-1-2024

Statistical learning theory is the foundation of machine learning, providing theoretical bounds for the risk of models learnt from a (single) training set, assumed to issue from an unknown probability distribution. In actual deployment, however, the data distribution may (and often does) vary, causing domain adaptation/generalization issues. In this paper we lay the foundations for a `credal' theory of learning, using convex sets of probabilities (credal sets) to model the variability in the data-generating distribution. Such credal sets, we argue, may be inferred from a finite sample of training sets. Bounds are derived for the case of finite hypotheses spaces (both assuming realizability or not) as well as infinite model spaces, which directly generalize classical results.

corollary 4, credal, theorem 4, (15 more...)

2402.00957

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.56)

Journal of Artificial Intelligence ResearchNov-11-2023

How to Tell Easy from Hard: Complexities of Conjunctive Query Entailment in Extensions of ALC

Bednarczyk, Bartosz (TU Dresden, DE) | Rudolph, Sebastian (TU Dresden)

It is commonly known that the conjunctive query entailment problem for certain extensions of (the well-known ontology language) ALC is computationally harder than their knowledge base satisfiability problem while for others the complexities coincide, both under the standard and the finite-model semantics. We expose a uniform principle behind this divide by identifying a wide class of (finitely) locally-forward description logics, for which we prove that (finite) query entailment problem can be solved by a reduction to exponentially many calls of the (finite) knowledge base satisfiability problem. Consequently, our algorithm yields tight ExpTime upper bounds for locally-forward logics with ExpTime-complete knowledge base satisfiability problem, including logics between ALC and µALCHbregQ (and more), as well as ALCSCC with global cardinality constraints, for which the complexity of querying remained open. Moreover, to make our technique applicable in future research, we provide easy-to-check sufficient conditions for a logic to be locally-forward based several versions of the on model-theoretic notion of unravellings. Together with existing results, this provides a nearly complete classification of the “benign” vs. “malign” primitive modelling features extending ALC, missing out only the Self operator. We then show a rather counter-intuitive result, namely that the conjunctive entailment problem for ALCSelf is exponentially harder than for ALC. This places the seemingly innocuous Self operator among the “malign” modelling features, like inverses, transitivity or nominals.

definition 4, homomorphism, interpretation, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.14482

AI Access Foundation

14482

Journal of Artificial Intelligence Research

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > Saxony > Dresden (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(19 more...)

Genre: Research Report (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)