AITopics | Edmonton

The Sauer-Shelah-Perles Lemma is a cornerstone of combinatorics and learning theory, bounding the size of a binary hypothesis class in terms of its Vapnik-Chervonenkis (VC) dimension. For classes of functions over a $k$-ary alphabet, namely the multiclass setting, the Natarajan dimension has long served as an analogue of VC dimension, yet the corresponding Sauer-type bounds are suboptimal for alphabet sizes $k>2$. In this work, we establish a sharp Sauer inequality for multiclass and list prediction. Our bound is expressed in terms of the Daniely–Shalev-Shwartz (DS) dimension and, more generally, its extension, the list-DS dimension; these are the combinatorial parameters that characterize multiclass and list PAC learnability. Our bound is tight for every alphabet size $k$, list size $\ell$, and dimension value, replacing the exponential dependence on $\ell$ in the Natarajan-based bound by the optimal polynomial dependence, and improving the dependence on $k$ as well. Our proof uses the polynomial method. In contrast to the classical VC case, where several direct combinatorial proofs are known, we are not aware of any purely combinatorial proof in the DS setting. This motivates several directions for future research, which are discussed in the paper. As consequences, we obtain improved sample complexity upper bounds for list PAC learning and for uniform convergence of list predictors, sharpening the recent results of Charikar et al. (STOC 2023), Hanneke et al. (COLT 2024), and Brukhim et al. (NeurIPS 2024).
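For orientation, the classical binary case mentioned at the start of the abstract can be checked by brute force on a toy class. The sketch below is not from the paper: it only illustrates the standard Sauer-Shelah-Perles bound $|H| \le \sum_{i=0}^{d} \binom{n}{i}$ for a binary class $H$ on $n$ points with VC dimension $d$. The helper names (`sauer_bound`, `vc_dimension`) and the threshold class are hypothetical choices made for illustration, and nothing here reproduces the paper's DS-dimension bound.

```python
from itertools import combinations
from math import comb


def sauer_bound(n: int, d: int) -> int:
    """Classical Sauer-Shelah-Perles bound: sum of C(n, i) for i = 0..d."""
    return sum(comb(n, i) for i in range(d + 1))


def vc_dimension(hypotheses, n: int) -> int:
    """Brute-force VC dimension of a set of binary tuples of length n."""
    best = 0
    for size in range(1, n + 1):
        shattered = False
        for idx in combinations(range(n), size):
            # Restrict every hypothesis to the chosen coordinates.
            patterns = {tuple(h[i] for i in idx) for h in hypotheses}
            if len(patterns) == 2 ** size:
                shattered = True
                break
        if not shattered:
            # Subsets of shattered sets are shattered, so we can stop here.
            break
        best = size
    return best


# Toy class (an assumption for illustration): thresholds on n ordered points,
# h_t(x) = 1 if x >= t, for t = 0..n.
n = 5
thresholds = {tuple(1 if x >= t else 0 for x in range(n)) for t in range(n + 1)}
d = vc_dimension(thresholds, n)
bound = sauer_bound(n, d)
print(f"|H| = {len(thresholds)}, VC dimension = {d}, Sauer bound = {bound}")
```

On this toy class the bound is met with equality, which is the sense of tightness that the abstract claims for its multiclass and list analogue.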

artificial intelligence, dimension, machine learning, (16 more...)

arXiv.org Machine Learning

2604.12952

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.24)
North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(6 more...)

Genre:

Instructional Material (0.46)
Research Report (0.40)

Industry: Education > Educational Setting (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)


Differentially Private Language Generation and Identification in the Limit

Mehrotra, Anay, Velegkas, Grigoris, Yu, Xifan, Zhou, Felix

arXiv.org Machine Learning | Apr-10-2026

We initiate the study of language generation in the limit, a model recently introduced by Kleinberg and Mullainathan [KM24], under the constraint of differential privacy. We consider the continual release model, where a generator must eventually output a stream of valid strings while protecting the privacy of the entire input sequence. Our first main result is that for countable collections of languages, privacy comes at no qualitative cost: we provide an $\varepsilon$-differentially-private algorithm that generates in the limit from any countable collection. This stands in contrast to many learning settings where privacy renders learnability impossible. However, privacy does impose a quantitative cost: there are finite collections of size $k$ for which uniform private generation requires $\Omega(k/\varepsilon)$ samples, whereas just one sample suffices non-privately. We then turn to the harder problem of language identification in the limit. Here, we show that privacy creates fundamental barriers. We prove that no $\varepsilon$-DP algorithm can identify a collection containing two languages with an infinite intersection and a finite set difference, a condition far stronger than the classical non-private characterization of identification. Next, we turn to the stochastic setting where the sample strings are sampled i.i.d. from a distribution (instead of being generated by an adversary). Here, we show that private identification is possible if and only if the collection is identifiable in the adversarial model. Together, our results establish new dimensions along which generation and identification differ and, for identification, a separation between adversarial and stochastic settings induced by privacy constraints.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2604.08504

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(7 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.66)


Practical Efficient Global Optimization is No-regret

Wang, Jingyi, Wang, Haowei, Chiang, Nai-Yuan, Mueller, Juliane, Hartland, Tucker, Petra, Cosmin G.

arXiv.org Machine Learning | Mar-27-2026

Efficient global optimization (EGO) is one of the most widely used noise-free Bayesian optimization algorithms. It comprises the Gaussian process (GP) surrogate model and expected improvement (EI) acquisition function. In practice, when EGO is applied, a scalar matrix of a small positive value (also called a nugget or jitter) is usually added to the covariance matrix of the deterministic GP to improve numerical stability. We refer to this EGO with a positive nugget as the practical EGO. Despite its wide adoption and empirical success, to date, cumulative regret bounds for practical EGO have yet to be established. In this paper, we present for the first time the cumulative regret upper bound of practical EGO. In particular, we show that practical EGO has sublinear cumulative regret bounds and thus is a no-regret algorithm for commonly used kernels including the squared exponential (SE) and Matérn kernels ($\nu>\frac{1}{2}$). Moreover, we analyze the effect of the nugget on the regret bound and discuss the theoretical implications for its choice. Numerical experiments are conducted to support and validate our findings.
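To make the role of the nugget concrete, here is a minimal one-dimensional sketch of the two ingredients the abstract names: a GP surrogate with an SE kernel whose covariance matrix has a small positive value added to its diagonal, and the EI acquisition function. It is not the authors' implementation. The function names, the unit lengthscale, the minimization convention for EI, the default nugget of `1e-6`, and the toy objective are all assumptions made for illustration.

```python
import math


def se_kernel(a: float, b: float, lengthscale: float = 1.0) -> float:
    """Squared exponential kernel with unit signal variance (assumed)."""
    return math.exp(-0.5 * ((a - b) / lengthscale) ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L y = b for lower-triangular L."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def backward_solve(L, y):
    """Solve L^T x = y for lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_posterior(xs, ys, x_new, nugget=1e-6):
    """Posterior mean and standard deviation at x_new; nugget is added to diag(K)."""
    n = len(xs)
    K = [[se_kernel(xs[i], xs[j]) + (nugget if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = backward_solve(L, forward_solve(L, ys))
    k_star = [se_kernel(x, x_new) for x in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_solve(L, k_star)
    var = max(se_kernel(x_new, x_new) - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(var)


def expected_improvement(mean, std, best):
    """EI for minimization, where best is the lowest observed value."""
    if std == 0.0:
        return max(best - mean, 0.0)
    z = (best - mean) / std
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    # Clip tiny negative values caused by floating-point round-off.
    return max((best - mean) * cdf + std * pdf, 0.0)


# Toy noise-free observations of a hypothetical objective.
xs = [0.0, 0.5, 1.0]
ys = [math.sin(3.0 * x) for x in xs]
best = min(ys)
grid = [i / 10.0 for i in range(21)]
ei_grid = [expected_improvement(*gp_posterior(xs, ys, x), best) for x in grid]
next_x = grid[max(range(len(grid)), key=lambda i: ei_grid[i])]
print("next query point:", next_x)
```

In this sketch, setting `nugget` to zero recovers the exact interpolating GP, at the risk of a failed Cholesky factorization when training points are close together. That numerical-stability trade-off is the practical motivation the abstract describes.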

artificial intelligence, eit 1, machine learning, (13 more...)

arXiv.org Machine Learning

2603.25311

Country:

North America > United States (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > Italy (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


High-Dimensional Gaussian Mean Estimation under Realizable Contamination

Diakonikolas, Ilias, Kane, Daniel M., Pittas, Thanasis

arXiv.org Machine Learning | Mar-18-2026

We study mean estimation for a Gaussian distribution with identity covariance in $\mathbb{R}^d$ under a missing data scheme termed the realizable $\varepsilon$-contamination model. In this model, an adversary can choose a function $r(x)$ taking values between 0 and $\varepsilon$, and each sample $x$ goes missing with probability $r(x)$. Recent work (Ma et al., 2024) proposed this model as an intermediate-strength setting between Missing Completely At Random (MCAR), where missingness is independent of the data, and Missing Not At Random (MNAR), where missingness may depend arbitrarily on the sample values and can lead to non-identifiability issues. That work established information-theoretic upper and lower bounds for mean estimation in the realizable contamination model. Their proposed estimators incur runtime exponential in the dimension, leaving open the possibility of computationally efficient algorithms in high dimensions. In this work, we establish an information-computation gap in the Statistical Query model (and, as a corollary, for Low-Degree Polynomials and PTF tests), showing that algorithms must either use substantially more samples than information-theoretically necessary or incur exponential runtime. We complement our SQ lower bound with an algorithm whose sample-time tradeoff nearly matches our lower bound. Together, these results qualitatively characterize the complexity of Gaussian mean estimation under $\varepsilon$-realizable contamination.
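The missingness mechanism described above is easy to simulate in one dimension. The sketch below is not from the paper: it draws Gaussian samples, lets a hypothetical adversary drop each sample $x$ with probability $r(x) \in [0, \varepsilon]$, and shows that the naive mean of the surviving samples is biased. The function names, the particular choice of $r$ (drop only positive samples, at the maximum rate), and the values of `eps`, `n`, and the seed are assumptions made for illustration.

```python
import random


def sample_with_realizable_contamination(n, mu, eps, r, seed=0):
    """Draw n samples from N(mu, 1); each sample x goes missing with probability r(x)."""
    rng = random.Random(seed)
    observed, missing = [], 0
    for _ in range(n):
        x = rng.gauss(mu, 1.0)
        p = r(x)
        if not 0.0 <= p <= eps:
            raise ValueError("r(x) must lie in [0, eps]")
        if rng.random() < p:
            missing += 1
        else:
            observed.append(x)
    return observed, missing


eps = 0.3
n = 20000


def drop_positive(x: float) -> float:
    """Hypothetical adversary: hide positive samples at the maximum allowed rate."""
    return eps if x > 0.0 else 0.0


observed, missing = sample_with_realizable_contamination(n, 0.0, eps, drop_positive)
naive_mean = sum(observed) / len(observed)
print(f"missing fraction: {missing / n:.3f}, naive mean of observed: {naive_mean:.3f}")
```

With a true mean of 0, this adversary pulls the naive estimate below zero even though at most an $\varepsilon$ fraction of any region is hidden, which is why estimators for this model need to do more than average the observed data.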

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

2603.16798

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


Locally Linear Continual Learning for Time Series based on VC-Theoretical Generalization Bounds

Ferreira, Yan V. G., Lima, Igor B., S., Pedro H. G. Mapa, Campos, Felipe V., Braga, Antonio P.

arXiv.org Machine Learning | Mar-17-2026

Most machine learning methods assume fixed probability distributions, limiting their applicability in nonstationary real-world scenarios. While continual learning methods address this issue, current approaches often rely on black-box models or require extensive user intervention for interpretability. We propose SyMPLER (Systems Modeling through Piecewise Linear Evolving Regression), an explainable model for time series forecasting in nonstationary environments based on dynamic piecewise-linear approximations. Unlike other locally linear models, SyMPLER uses generalization bounds from Statistical Learning Theory to automatically determine when to add new local models based on prediction errors, eliminating the need for explicit clustering of the data. Experiments show that SyMPLER can achieve comparable performance to both black-box and existing explainable models while maintaining a human-interpretable structure that reveals insights about the system's behavior. In this sense, our approach reconciles accuracy and interpretability, offering a transparent and adaptive solution for forecasting nonstationary time series.

artificial intelligence, local model, machine learning, (15 more...)

arXiv.org Machine Learning

doi: 10.1109/TPAMI.2026.3672726

2603.13674

Country: