AITopics

2606.28309

Genre:

Research Report (0.64)
Instructional Material (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Neural Information Processing Systems, Jun-23-2026, 01:44:34 GMT

Marginal-Nonuniform PAC Learnability

We revisit the classical model of nonuniform PAC learning, introduced by Benedek and Itai [1994], where generalization guarantees may depend on the target concept (but not on the marginal distribution). In this work, we study a complementary variant, which we call marginal-nonuniform learning. In this setting, guarantees may depend on the marginal distribution over the domain, but must hold uniformly over all concepts. This captures the intuition that some data distributions are inherently easier to learn from than others, allowing for a flexible, distribution-sensitive view of learnability. Our main result is a complete characterization of the achievable learning rates in this model, revealing a trichotomy: exponential rates of the form $e^{-n}$ arise precisely when the hypothesis class is finite; linear rates of the form $d/n$ are achievable when a recently introduced combinatorial parameter, the VC-eluder dimension $d$, is finite; and arbitrarily slow rates may occur when $d = \infty$. Additionally, in the original (concept-)nonuniform model, we show that for all learnable classes linear rates are achievable. We conclude by situating marginal-nonuniform learning within the landscape of universal learning, and by discussing its relationship to other distribution-dependent learning paradigms.
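
As a reading aid, the trichotomy stated in the abstract can be written schematically as below. This is only a sketch: the symbol $\mathcal{H}$ for the hypothesis class and the phrase "best achievable rate" are notation assumed here, constants and the exact quantifiers over marginal distributions are omitted, and the condition on the middle case is inferred from the word "trichotomy" rather than stated in the abstract.

```latex
\[
\text{best achievable rate for } \mathcal{H} \;\approx\;
\begin{cases}
e^{-n} & \text{if } |\mathcal{H}| < \infty, \\[2pt]
d/n & \text{if } |\mathcal{H}| = \infty \text{ and } d < \infty, \\[2pt]
\text{arbitrarily slow} & \text{if } d = \infty,
\end{cases}
\]
% n: sample size; d: VC-eluder dimension of \mathcal{H}.
```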

artificial intelligence, dimension, machine learning, (18 more...)

Country: Europe (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.90)

Neural Information Processing Systems, Jun-22-2026, 16:03:12 GMT

Strategic Classification with Non-Linear Classifiers

In strategic classification, the standard supervised learning setting is extended to support the notion of strategic user behavior in the form of costly feature manipulations made in response to a classifier. While standard learning supports a broad range of model classes, the study of strategic classification has, so far, been dedicated mostly to linear classifiers. This work aims to expand the horizon by exploring how strategic behavior manifests under non-linear classifiers and what this implies for learning. We take a bottom-up approach showing how non-linearity affects decision boundary points, classifier expressivity, and model class complexity. Our results show how, unlike the linear case, strategic behavior may either increase or decrease effective class complexity, and that the complexity decrease may be arbitrarily large. Another key finding is that universal approximators (e.g., neural nets) are no longer universal once the environment is strategic. We demonstrate empirically how this can create performance gaps even on an unrestricted model class.
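
To make "costly feature manipulations made in response to a classifier" concrete, here is a minimal sketch under a commonly used best-response model, which is an assumption and not taken from the abstract: a user at `x` moves to a positively classified point whenever the manipulation cost does not exceed a fixed `budget`. The helper names (`best_response_label`, `interval_classifier`), the one-dimensional grid, and the absolute-value cost are all hypothetical choices for illustration.

```python
def best_response_label(x, classifier, candidates, budget, cost):
    """Label that a strategic user at x ends up with (assumed model).

    The user keeps x if it is already classified positive; otherwise the user
    moves to any candidate point z with classifier(z) == 1 and
    cost(x, z) <= budget, if such a point exists.
    """
    if classifier(x) == 1:
        return 1
    reachable = (z for z in candidates if cost(x, z) <= budget)
    return int(any(classifier(z) == 1 for z in reachable))


def interval_classifier(x):
    """A simple non-linear classifier: positive exactly on [2, 3]."""
    return 1 if 2.0 <= x <= 3.0 else 0


# Hypothetical 1-D feature grid 0.0, 0.5, ..., 5.0 and absolute-value cost.
grid = [i * 0.5 for i in range(11)]
abs_cost = lambda x, z: abs(x - z)

# Effective (post-manipulation) label of every grid point with budget 1.
effective = {
    x: best_response_label(x, interval_classifier, grid, 1.0, abs_cost)
    for x in grid
}
```

In this toy setting the effective positive region grows from [2, 3] to [1, 4] on the grid, which illustrates how the classifier that users actually induce can differ from the one that was deployed.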

artificial intelligence, decision boundary, machine learning, (18 more...)

Country:

Asia > Middle East > Israel (0.28)
Africa (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.86)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing Systems, Jun-21-2026, 17:37:31 GMT

On the VC dimension of deep group convolutional neural networks

Equivariant neural networks outperform traditional deep neural networks on a number of tasks. The theoretical understanding of their generalization properties remains, however, limited. In this paper, we analyze the generalization capabilities of Group Convolutional Neural Networks (GCNNs) with ReLU activation function through the lens of Vapnik-Chervonenkis (VC) dimension theory. By deriving upper and lower bounds, we investigate how the network architecture affects the VC dimension.

artificial intelligence, deep learning, machine learning, (19 more...)

Country:

North America > United States (0.67)
Europe (0.67)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Jun-18-2026, 15:39:39 GMT

How to Learn a Star: Binary Classification with Starshaped Polyhedral Sets

We consider binary classification restricted to a class of continuous piecewise linear functions whose decision boundaries are (possibly nonconvex) starshaped polyhedral sets, supported on a fixed polyhedral simplicial fan. We investigate the expressivity of these function classes and describe the combinatorial and geometric structure of the loss landscape, most prominently the sublevel sets, for two loss-functions: the 0/1-loss (discrete loss) and a log-likelihood loss function. In particular, we give explicit bounds on the VC dimension of this model, and concretely describe the sublevel sets of the discrete loss as chambers in a hyperplane arrangement. For the log-likelihood loss, we give sufficient conditions for the optimum to be unique, and describe the geometry of the optimum when varying the rate parameter of the underlying exponential probability distribution.

arrangement, artificial intelligence, machine learning, (20 more...)

Country:

Europe (0.28)
Asia > Japan (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Jun-13-2026, 20:11:00 GMT

On the VC dimension of deep group convolutional neural networks

Recent works have introduced new equivariant neural networks, motivated by their improved generalization compared to traditional deep neural networks. While experiments support this advantage, the theoretical understanding of their generalization properties remains limited. In this paper, we analyze the generalization capabilities of Group Convolutional Neural Networks (GCNNs) with the ReLU activation function through the lens of Vapnik-Chervonenkis (VC) dimension theory. We investigate how architectural factors--such as the number of layers, weights, and input dimensions--affect the VC dimension. A key challenge in our analysis is proving a lower bound on the VC dimension, for which we introduce new techniques, establishing a novel connection between GCNNs and standard deep neural networks. Additionally, we compare our derived bounds to those known for fully connected neural networks. Our results extend previous findings on the VC dimension of continuous GCNNs with two layers, offering new insights into their generalization behavior, particularly their dependence on input resolution.

artificial intelligence, machine learning, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

arXiv.org Machine Learning, Jun-5-2026

TinyML-Driven Cybersecurity for Autonomous Spacecraft: Latency-Accuracy Analysis for SPARTA RF and Cyber Threat Detection

Le, Van, Tran, Trevor, Le, Tan

Autonomous spacecraft require rapid, lightweight, and reliable onboard detection of cyber-RF threats. Using the SPARTA attack model, we analyze the latency-accuracy trade-offs of TinyML-compatible classical models -- Random Forest, Logistic Regression, SVM, and MLP -- for detecting uplink jamming, Fake-NR spoofing, payload manipulation, ground-segment compromise, and unauthorized command injection. We present a physics-informed theoretical analysis of each model's computational complexity, VC dimension, Lipschitz continuity, and latency scaling, supported by empirical measurements on adversarial RF spectrograms generated via BandErasure, FakeNR, and NoiseBurst corruption modes. Results show that Logistic Regression achieves microsecond-level inference with only a 1% accuracy drop relative to Random Forest, making it an effective TinyML baseline for onboard autonomy. The study also identifies opportunities for advancing spacecraft cybersecurity through richer feature encoders and multi-timescale learning architectures, building on recent progress in edge intelligence and trustworthy AI.

artificial intelligence, machine learning, threat, (18 more...)

2606.05779

Country: North America > United States > Virginia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Campbell, Jesse, Ibaibarriaga, Daniel, Reyzin, Lev

Contradiction Graphs Determine VC Dimension

arXiv.org Machine Learning, May-21-2026

The Vapnik-Chervonenkis dimension is the fundamental combinatorial parameter of distribution-free binary classification. Introduced by Vapnik and Chervonenkis in their work on uniform convergence [VC71], and closely connected to the Sauer-Shelah lemma [Sau72, She72], it characterizes classical PAC learnability [Val84, BEHW89, EHKV89]. In particular, finite VC dimension is equivalent to distribution-free learnability. This paper asks whether that finite-versus-infinite VC dichotomy is still visible after replacing a concept class by its contradiction graphs. For a binary class $H \subseteq \{0,1\}^X$, the order-$m$ contradiction graph $G_m(H)$ has as vertices the $H$-realizable labeled samples of length $m$, with an edge between two samples if they assign opposite labels to some common domain point. Throughout, samples are ordered sequences, and repetitions of domain points are allowed, subject to consistent labeling. We use the contradiction-graph framework introduced by Alon et al. in their graph-theoretic characterization of private learnability [AMSY24]. They ask whether two binary classes can have isomorphic contradiction graphs at every level while one has finite VC dimension and the other has infinite VC dimension.
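
The definition of the order-m contradiction graph above can be illustrated with a brute-force construction for a small finite class. This is a minimal sketch, not the authors' code: hypotheses are represented as Python dicts from domain points to labels, and the function names are hypothetical.

```python
from itertools import product


def realizable_samples(hypotheses, domain, m):
    """All H-realizable labeled samples of length m.

    Samples are ordered tuples of (point, label) pairs; repeated points are
    allowed and are labeled consistently because each sample is produced by
    a single hypothesis.
    """
    samples = set()
    for xs in product(domain, repeat=m):
        for h in hypotheses:
            samples.add(tuple((x, h[x]) for x in xs))
    return sorted(samples)


def contradict(s, t):
    """True if s and t assign opposite labels to some common domain point."""
    labels_s = dict(s)
    return any(x in labels_s and labels_s[x] != y for x, y in t)


def contradiction_graph(hypotheses, domain, m):
    """Return (vertices, edges) of the order-m contradiction graph."""
    vertices = realizable_samples(hypotheses, domain, m)
    edges = {
        (s, t)
        for i, s in enumerate(vertices)
        for t in vertices[i + 1:]
        if contradict(s, t)
    }
    return vertices, edges


# Toy class: three threshold-like hypotheses on the domain {0, 1}.
domain = [0, 1]
H = [{0: 0, 1: 0}, {0: 0, 1: 1}, {0: 1, 1: 1}]
vertices_1, edges_1 = contradiction_graph(H, domain, 1)
vertices_2, edges_2 = contradiction_graph(H, domain, 2)
```

For this toy class the order-1 graph has four vertices and two edges, one for each domain point that receives both labels. The enumeration is exponential in m, so it is only meant for checking small examples by hand.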

artificial intelligence, clique, machine learning, (18 more...)

2605.20434

Country: North America > United States > Illinois > Cook County > Chicago (0.40)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Hanneke, Steve, Mehrotra, Anay, Velegkas, Grigoris, Zampetakis, Manolis

What is Learnable in Valiant's Theory of the Learnable?

arXiv.org Machine Learning, May-14-2026

Valiant's 1984 paper is widely credited with introducing the PAC learning model, but it, in fact, introduced a different model: unlike PAC learning, the learner receives only positives, may issue membership queries, and must output a hypothesis with no false positives. Prior work characterized variants, including the case without queries. We revisit Valiant's original model and ask: *Which classes are learnable in it?* For every finite domain, including Valiant's Boolean-hypercube setting, we show that a class is learnable if and only if every realizable positive sample can be certified by a poly-size adaptive query-compression scheme. This is a new variant of sample compression where the learner certifies samples via a short interaction with the membership oracle. Our characterization shows that learnability in Valiant's model is strictly sandwiched between learnability in the PAC model and the variant of Valiant's model without membership queries. This is one of the rare cases where introducing membership queries changes the set of learnable classes, and not just the sample or computational complexity. Next, we study the natural extension of the model to arbitrary domains. While we do not obtain an exact characterization, our techniques readily generalize and show that the same strict sandwiching persists. Finally, we show that $d$-dimensional halfspaces, which are not learnable without queries, are learnable with queries: we give a $\mathrm{poly}(d) \tilde{O}(1/ε)$ sample and $\mathrm{poly}(d) \mathrm{polylog}(1/ε)$ query algorithm, and prove that at least $Ω(d)$ samples or queries are necessary. To our knowledge, this is the first algorithm for halfspaces in Valiant's model. Together, these results uncover a surprisingly rich theory behind Valiant's original notion of learnability and introduce ideas that may be of independent interest in learning theory.

artificial intelligence, machine learning, valiant, (17 more...)

2605.1384

Country: North America > United States (0.27)

Genre: Research Report (1.00)

Industry:

Education (0.45)
Energy (0.45)
Law (0.33)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)

arXiv.org Machine Learning, Apr-29-2026

Null Measurability at the Symmetrization Interface in VC Learning

Gupta, Dhruv

Recent work revisiting measurability in the fundamental theorem of statistical learning imposes Borel measurability of ghost-gap suprema. We show that, at the one-sided ghost-gap interface actually used by the standard symmetrization proof, this requirement is stronger than necessary. For any Borel-parameterized concept class on a Polish domain, the bad event "there exists a hypothesis whose ghost empirical error exceeds its training empirical error by at least ε/2" is analytic. By Choquet capacitability, it is therefore measurable in the completion of every finite Borel measure. We then construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation from the Borel supremum condition. Finally, we prove closure under patching, fixed and countable interpolation, and fiber-product amalgamation, showing that the weaker regularity level is stable under natural concept-class constructors. In the realizable setting, where targets belong to the class and are measurable, these results weaken the measurability hypothesis needed by the symmetrization route from finite VC dimension to PAC learnability. The main results and the descriptive-set-theoretic infrastructure used by them are formalized in Lean 4.
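
The bad event quoted in the abstract can be written out as follows. The notation is assumed here for illustration and may differ from the paper's: $S$ is the training sample, $S'$ the ghost sample, both of size $n$, $\mathcal{H}$ is the concept class, and $\widehat{\mathrm{err}}_{S}(h)$ denotes the empirical error of $h$ on $S$.

```latex
\[
B_{\varepsilon}
  \;=\;
  \Bigl\{ (S, S') \;:\; \exists\, h \in \mathcal{H},\;
     \widehat{\mathrm{err}}_{S'}(h) - \widehat{\mathrm{err}}_{S}(h)
     \;\ge\; \varepsilon/2 \Bigr\}.
\]
% One-sided gap: ghost error minus training error, with no absolute value.
```

The existential quantifier over $h$ is what makes Borel measurability delicate: for an uncountable class, $B_{\varepsilon}$ is an uncountable union of the per-hypothesis events, which is why the abstract works with analytic sets and measurability in the completion instead.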

artificial intelligence, bad event, machine learning, (17 more...)

2604.25028

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)