AITopics | lemma


From Spectral Methods to Sample Complexity Bounds for Fourier Neural Operators

Chandramoorthy, Nisha, Sanz-Alonso, Daniel, Waniorek, Nathan

arXiv.org Machine Learning, Jul-2-2026

We establish approximation and learning guarantees for Fourier neural operators (FNOs) applied to time-$T$ solution operators of dissipative evolution equations. The analysis builds on the premise that FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations. To formalize this idea, we introduce classes of evolution operators defined through spectral methods and derive FNO approximation bounds and polynomial sample complexity guarantees for these classes. For equations with polynomial nonlinearities, the learning rates depend primarily on the smoothness of the input space and the dimension of the physical domain. Our results hold uniformly over broad families of dissipative equations, rather than for a single fixed PDE, and apply in particular to the Navier--Stokes, Allen--Cahn, and Cahn--Hilliard equations. For equations with non-polynomial smooth nonlinearities, we prove that polynomial sample complexity still holds with rates that now additionally depend on the smoothness of the nonlinear terms and the dissipation strength. Overall, we connect classical spectral approximation theory with modern operator learning and explain when FNOs can learn nonlinear evolution operators efficiently.
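As a rough illustration of the link between FNOs and spectral methods, the sketch below implements only the Fourier-multiplier part of an FNO layer in one dimension: transform, rescale a few low modes, drop the rest, transform back. It is a minimal pure-Python sketch, not the construction analysed in the paper; the names `dft` and `spectral_conv`, the plain O(n^2) transform, and the example weights are all assumptions made for illustration.

```python
import cmath
import math

def dft(u):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(u)
    return [sum(u[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

def spectral_conv(u, weights):
    """Multiply Fourier mode k by weights[k] for k = 0..len(weights)-1.

    Mode -k receives the conjugate weight so a real input gives a real output;
    all higher modes are set to zero. Assumes len(weights) <= len(u) // 2.
    """
    n = len(u)
    u_hat = dft(u)
    out_hat = [0j] * n
    for k, r in enumerate(weights):
        out_hat[k] = r * u_hat[k]
        if k > 0:
            out_hat[n - k] = r.conjugate() * u_hat[n - k]
    # Inverse transform; the imaginary part is only rounding error here.
    return [sum(out_hat[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)).real / n
            for j in range(n)]

# Hypothetical input: one low-frequency and one high-frequency cosine.
n = 16
low = [math.cos(2 * math.pi * j / n) for j in range(n)]
high = [math.cos(2 * math.pi * 6 * j / n) for j in range(n)]
u = [a + b for a, b in zip(low, high)]

# Identity weights on modes 0, 1, 2 keep the low mode and discard mode 6.
filtered = spectral_conv(u, [1.0, 1.0, 1.0])
max_error = max(abs(f - l) for f, l in zip(filtered, low))
```

With identity weights the layer acts as a low-pass filter, which is the same truncation step a spectral Galerkin discretization performs. A full FNO layer would add a pointwise linear term and a nonlinearity, both omitted here.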

artificial intelligence, machine learning, operator, (18 more...)


2607.0032

Country: North America > United States > Illinois > Cook County > Chicago (0.40)

Genre: Research Report (0.64)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


SGD Provably Prioritizes a Shortcut Spurious Feature in the XOR Model

LaBonte, Tyler, Muthukumar, Vidya

arXiv.org Machine Learning, Jun-30-2026

Neural networks are known to be susceptible to over-reliance on spurious correlations. However, the precise mechanism by which models exploit shortcut features is not fully understood, and algorithms to mitigate this behavior rely on as yet unjustified assumptions about the learned representations. In this work, we provide the first end-to-end theoretical characterization of spurious feature learning for two-layer ReLU neural networks trained by online minibatch SGD on the logistic loss. We consider data drawn from the high-dimensional Boolean hypercube with a quadratic signal function (namely XOR) and a linear spurious correlation. We show that SGD learns the spurious feature first, and exponentially fast. Moreover, the optimization dynamics couple the spurious and signal features, with a stronger spurious component inhibiting signal feature learning. Our analysis reveals precise phase transitions in the learning dynamics. In the first phase, alignment between the signs of the spurious feature and second-layer weight drives rapid growth of the spurious feature. In the second phase, large majority group margin slows learning and the signal feature remains suppressed. When the spurious correlation is maximally strong, we show theoretically that the spurious feature dominates even at the sample complexity threshold where XOR would be learned in isolation (i.e., if the spurious feature was absent). In contrast, when the correlation strength is constant, we provide preliminary empirical evidence that the model can eventually learn the XOR signal, although the spurious feature is not forgotten.
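The data model in this abstract can be made concrete with a small generator. The sketch below is an assumption-laden reading of it, not the paper's exact setup: inputs are uniform on the Boolean hypercube, the label is the XOR (product) of the first two coordinates, and the third coordinate is overwritten so that it agrees with the label with probability `p_agree`. The function name, the choice of coordinates, and `p_agree` are hypothetical.

```python
import random

def sample_xor_with_spurious(d, p_agree, rng):
    """Draw (x, y) with x in {-1, +1}^d and y = x[0] * x[1] (XOR in +/-1 form).

    Coordinate x[2] is then overwritten to equal y with probability p_agree,
    which makes it a linear shortcut feature (an illustrative assumption).
    """
    x = [rng.choice((-1, 1)) for _ in range(d)]
    y = x[0] * x[1]
    x[2] = y if rng.random() < p_agree else -y
    return x, y

rng = random.Random(0)
data = [sample_xor_with_spurious(10, 0.9, rng) for _ in range(5000)]

# Accuracy of predicting y from a single coordinate.
spurious_acc = sum(x[2] == y for x, y in data) / len(data)
signal_coord_acc = sum(x[0] == y for x, y in data) / len(data)
```

Here the spurious coordinate alone predicts the label about 90% of the time, while either signal coordinate on its own is at chance level. That gap is one way to see why a linear shortcut can be picked up before the quadratic signal.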

artificial intelligence, deep learning, machine learning, (20 more...)


2606.30444

Genre: Research Report (0.50)

Industry: Health & Medicine (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)


Self-Organized Conformal Prediction: Reducing Regional Coverage Gaps with Unsupervised Group Discovery

Berthier, Louis, Shokry, Ahmed, Moreaud, Maxime, Ramelet, Guillaume, Dieuleveut, Aymeric

arXiv.org Machine Learning, Jun-30-2026

Conformal prediction guarantees marginal coverage, but pooled calibration averages over heterogeneous regions and can mask regional undercoverage in safety-critical subgroups. We introduce Self-Organized Conformal Prediction (SOCP), a calibration scheme that discovers input-space groups with a Self-Organizing Map (SOM) and, at test time, draws a local calibration buffer from the query's best-matching unit (BMU) cell or a fixed grid neighborhood. The same retrieval rule applies to regression and classification tasks across tabular features and image embeddings, leaving the predictor and nonconformity score untouched. SOCP gives exact validity for BMU-cell retrieval and fixed retrieved-set validity for neighborhood buffers; central-cell validity for neighborhood retrieval holds up to a Kolmogorov-Smirnov (KS) bias term. A split-routed extension recovers fixed retrieved-set validity conditional on the routing split. On eight regression and classification benchmarks, SO-SCP reduces the weighted regional coverage gap on $7/8$ datasets (mean paired change $-7.1\%$) for a mean prediction-set size increase of $6.2\%$, with negligible overhead on the largest six datasets; SO-CQR yields smaller gains, since quantile regression already absorbs much of the heterogeneity. By learning groups directly from the input geometry, SOCP provides group-local calibration with exact fixed-group guarantees and approximate central-cell guarantees, without supervised partitions or predictor retraining.
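The contrast between pooled and group-local calibration can be sketched in a few lines. This is not SOCP itself: the Self-Organizing Map is replaced by two fixed, hand-picked prototypes (an assumption made to keep the example short), the predictor is the constant zero function, and the nonconformity score is the absolute residual. All names are hypothetical.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Split-conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return float("inf") if k > n else sorted(scores)[k - 1]

def nearest_cell(x, prototypes):
    """Index of the closest prototype; stands in for a SOM best-matching unit."""
    return min(range(len(prototypes)), key=lambda j: abs(x - prototypes[j]))

def draw(n, rng):
    """Heteroscedastic toy data: low noise for x < 1, high noise otherwise."""
    pts = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)
        sigma = 0.1 if x < 1.0 else 1.0
        pts.append((x, rng.gauss(0.0, sigma)))
    return pts

rng = random.Random(0)
prototypes = [0.5, 1.5]   # assumed fixed cells, not learned
alpha = 0.1
calib = draw(2000, rng)
test = draw(4000, rng)

# The predictor is f(x) = 0, so the score of (x, y) is simply |y|.
pooled_q = conformal_quantile([abs(y) for _, y in calib], alpha)
local_q = [
    conformal_quantile(
        [abs(y) for x, y in calib if nearest_cell(x, prototypes) == j], alpha)
    for j in range(len(prototypes))
]

# Coverage inside the noisy region under each calibration rule.
noisy = [(x, y) for x, y in test if nearest_cell(x, prototypes) == 1]
pooled_cov_noisy = sum(abs(y) <= pooled_q for _, y in noisy) / len(noisy)
local_cov_noisy = sum(abs(y) <= local_q[1] for _, y in noisy) / len(noisy)
```

In this toy setting the pooled threshold is too small for the noisy region and undercovers there, while the cell-local threshold restores coverage near the nominal 90% at the cost of wider intervals in that cell.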

artificial intelligence, machine learning, threshold, (17 more...)


2606.29403

Country: North America > United States (0.29)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)


Ultrametric Cluster Hierarchies: I Want 'em All!

Neural Information Processing Systems, Jun-23-2026, 12:01:31 GMT

Hierarchical clustering is a powerful tool for exploratory data analysis, organizing data into a tree of clusterings from which a partition can be chosen. This paper generalizes these ideas by proving that, for any reasonable hierarchy, one can optimally solve any center-based clustering objective over it (such as k-means). Moreover, these solutions can be found exceedingly quickly and are themselves necessarily hierarchical. Thus, given a cluster tree, we show that one can quickly access a plethora of new, equally meaningful hierarchies. Just as in standard hierarchical clustering, one can then choose any desired partition from these new hierarchies. We conclude by verifying the utility of our proposed techniques across datasets, hierarchies, and partitioning schemes.

data mining, machine learning, node, (17 more...)


Country: Europe > Austria (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)


Small Resamples, Sharp Guarantees: Convergence Rates for Resampled Studentized Quantile Estimators

Neural Information Processing Systems, Jun-23-2026, 07:46:05 GMT

The m-out-of-n bootstrap, proposed by Bickel et al. [1992], approximates the distribution of a statistic by repeatedly drawing subsamples of size m (m ≪ n) without replacement from an original sample of size n; it is now routinely used for robust inference with heavy-tailed data, bandwidth selection, and other large-sample applications. Despite this broad applicability across econometrics, biostatistics, and machine-learning workflows, rigorous parameter-free guarantees for the soundness of the m-out-of-n bootstrap when estimating sample quantiles have remained elusive. This paper establishes such guarantees by analysing the estimator of sample quantiles obtained from m-out-of-n resampling of a dataset of length n. We first prove a central limit theorem for a fully data-driven version of the estimator that holds under a mild moment condition and involves no unknown nuisance parameters. We then show that the moment assumption is essentially tight by constructing a counter-example in which the CLT fails. Strengthening the assumptions slightly, we derive an Edgeworth expansion that delivers exact convergence rates and, as a corollary, a Berry-Esseen bound on the bootstrap approximation error. Finally, we illustrate the scope of our results by obtaining parameter-free asymptotic distributions for practical statistics, including the quantiles for random walk Metropolis-Hastings (MH) and rewards of ergodic MDPs, thereby demonstrating the usefulness of our theory in modern estimation and learning tasks.
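The resampling scheme described above can be sketched as follows. This is a minimal illustration of plain m-out-of-n resampling for a quantile, without the studentization or the data-driven normalization the paper analyses; the function name, the order-statistic convention, and the Gaussian toy sample are assumptions.

```python
import random
import statistics

def m_out_of_n_quantile(sample, m, q, n_resamples, rng):
    """Return one q-quantile estimate per subsample of size m.

    Each subsample is drawn without replacement from `sample`; the estimate is
    the order statistic at index int(q * m), one of several common conventions.
    """
    estimates = []
    for _ in range(n_resamples):
        sub = sorted(rng.sample(sample, m))
        estimates.append(sub[min(m - 1, int(q * m))])
    return estimates

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]   # n = 2000
estimates = m_out_of_n_quantile(sample, m=200, q=0.5, n_resamples=500, rng=rng)

center = statistics.mean(estimates)    # should sit near the true median, 0
spread = statistics.stdev(estimates)   # variability at subsample size m
```

The spread of `estimates` reflects the sampling variability of the median at size m rather than n, so in practice it has to be rescaled before it is used for inference about the full-sample quantile.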

artificial intelligence, machine learning, markov chain, (18 more...)


Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Education (0.67)
Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)


Discovering Opinion Intervals from Conflicts in Signed Graphs

Peter Blohm, Florian Chen, Aristides Gionis, Stefan Neumann

Neural Information Processing Systems, Jun-23-2026, 04:21:17 GMT

Online social media provide a platform for people to discuss current events and exchange opinions with their peers. While interactions are predominantly positive, in recent years, there has been a lot of research to understand the conflicts in social networks and how they are based on different views and opinions. In this paper, we ask whether the conflicts in a network reveal a small and interpretable set of prevalent opinion ranges that explain the users' interactions. More precisely, we consider signed graphs, where the edge signs indicate positive and negative interactions of node pairs, and our goal is to infer opinion intervals that are consistent with the edge signs.

artificial intelligence, machine learning, vertex, (17 more...)


Country:

North America > United States (0.67)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry:

Government (0.93)
Information Technology > Services (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)


Private Online Learning against an Adaptive Adversary: Realizable and Agnostic Settings

Neural Information Processing Systems, Jun-23-2026, 02:26:32 GMT

We revisit the problem of private online learning, in which a learner receives a sequence of $T$ data points and must respond with a hypothesis at each time step. The entire stream of output hypotheses is required to satisfy differential privacy. Prior work of Golowich and Livni [2021] established that every concept class $H$ with finite Littlestone dimension $d$ is privately online learnable in the realizable setting. In particular, they proposed an algorithm that achieves an $O_d(\log T)$ mistake bound against an oblivious adversary. However, their approach yields a suboptimal $O_d(\sqrt{T})$ bound against an adaptive adversary. In this work, we present a new algorithm with a mistake bound of $O_d(\log T)$ against an adaptive adversary, closing this gap. We further investigate the problem in the agnostic setting, which is more general than the realizable setting as it does not impose any assumptions on the data. We give an algorithm that obtains a sublinear regret of $O_d(\sqrt{T})$ for generic Littlestone classes, demonstrating that they are also privately online learnable in the agnostic setting.

artificial intelligence, learner, machine learning, (17 more...)


Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)


f0156a82b6af6a4e838923ce9c124424-Paper-Conference.pdf

Neural Information Processing Systems, Jun-23-2026, 02:12:07 GMT

Structure-agnostic causal inference studies how well one can estimate a treatment effect given black-box machine learning estimates of nuisance functions (like the impact of confounders on treatment and outcomes). Here, we find that the answer depends in a surprising way on the distribution of the treatment noise. Focusing on the partially linear model of Robinson [1988], we first show that the widely adopted double machine learning (DML) estimator is minimax rate-optimal for Gaussian treatment noise, resolving an open problem of Mackey et al. [2018]. Meanwhile, for independent non-Gaussian treatment noise, we show that DML is always suboptimal by constructing new practical procedures with higher-order robustness to nuisance errors. These ACE procedures use structure-agnostic cumulant estimators to achieve r-th order insensitivity to nuisance errors whenever the (r + 1)-st treatment cumulant is non-zero. We complement these core results with novel minimax guarantees for binary treatments in the partially linear model. Finally, using synthetic demand estimation experiments, we demonstrate the practical benefits of our higher-order robust estimators.
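For readers unfamiliar with the baseline, the double machine learning estimator in the partially linear model $Y = \theta T + g(X) + \varepsilon$ can be sketched as residual-on-residual regression with cross-fitting. The code below is a minimal illustration, not the paper's ACE procedures: the nuisance learner is a simple binned mean on a one-dimensional covariate, the data are synthetic with Gaussian treatment noise, and every name is hypothetical.

```python
import math
import random

def binned_mean_fit(xs, ys, n_bins):
    """Piecewise-constant regression on [0, 1): the mean of ys within each bin."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(n_bins - 1, int(x * n_bins))
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[min(n_bins - 1, int(x * n_bins))]

def dml_partially_linear(xs, ts, ys, n_bins=20):
    """Two-fold cross-fitted DML: regress Y residuals on T residuals."""
    n = len(xs)
    folds = [list(range(0, n, 2)), list(range(1, n, 2))]
    num = den = 0.0
    for k in (0, 1):
        train, held_out = folds[1 - k], folds[k]
        # Nuisance estimates of E[T | X] and E[Y | X], fitted on the other fold.
        m_hat = binned_mean_fit([xs[i] for i in train], [ts[i] for i in train], n_bins)
        l_hat = binned_mean_fit([xs[i] for i in train], [ys[i] for i in train], n_bins)
        for i in held_out:
            t_res = ts[i] - m_hat(xs[i])
            y_res = ys[i] - l_hat(xs[i])
            num += t_res * y_res
            den += t_res * t_res
    return num / den

# Synthetic data with an assumed true effect theta = 2.
rng = random.Random(0)
theta = 2.0
xs = [rng.random() for _ in range(4000)]
ts = [math.sin(2 * math.pi * x) + rng.gauss(0.0, 1.0) for x in xs]
ys = [theta * t + x * x + rng.gauss(0.0, 1.0) for x, t in zip(xs, ts)]
theta_hat = dml_partially_linear(xs, ts, ys)
```

Errors in `m_hat` and `l_hat` enter the estimate only through their product, which is the first-order insensitivity that the abstract's higher-order procedures extend for non-Gaussian treatment noise.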

artificial intelligence, assumption, machine learning, (17 more...)


Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


The Complexity of Symmetric Equilibria in Min-Max Optimization and Team Zero-Sum Games

Neural Information Processing Systems, Jun-23-2026, 02:01:59 GMT

We consider the problem of computing stationary points in min-max optimization, with a focus on the special case of Nash equilibria in (two-)team zero-sum games. We first show that computing ϵ-Nash equilibria in 3-player adversarial team games, wherein a team of 2 players competes against a single adversary, is CLS-complete, resolving the complexity of Nash equilibria in such settings. Our proof proceeds by reducing from symmetric ϵ-Nash equilibria in symmetric, identical-payoff, two-player games, by suitably leveraging the adversarial player so as to enforce symmetry without disturbing the structure of the game. In particular, the class of instances we construct comprises solely polymatrix games, thereby also settling a question left open by Hollender, Maystre, and Nagarajan (2024). Moreover, we establish that computing symmetric (first-order) equilibria in symmetric min-max optimization is PPAD-complete, even for quadratic functions. Building on this reduction, we show that computing symmetric ϵ-Nash equilibria in symmetric, 6-player (3 vs. 3) team zero-sum games is also PPAD-complete, even for ϵ = poly(1/n). As a corollary, this precludes the existence of symmetric dynamics, which includes many of the algorithms considered in the literature, converging to stationary points. Finally, we prove that computing a non-symmetric poly(1/n)-equilibrium in symmetric min-max optimization is FNP-hard.

equilibrium, machine learning, natural language, (17 more...)


Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)


Low-Precision Streaming PCA

Neural Information Processing Systems, Jun-23-2026, 00:51:54 GMT

Low-precision Streaming PCA estimates the top principal component in a streaming setting under limited precision. We establish an information-theoretic lower bound on the quantization resolution required to achieve a target accuracy for the leading eigenvector. We study Oja's algorithm for streaming PCA under linear and nonlinear stochastic quantization. The quantized variants use unbiased stochastic quantization of the weight vector and the updates. Under mild moment and spectral-gap assumptions on the data distribution, we show that a batched version achieves the lower bound up to logarithmic factors under both schemes. This leads to a nearly dimension-free quantization error in the nonlinear quantization setting. Empirical evaluations on synthetic streams validate our theoretical findings and demonstrate that our low-precision methods closely track the performance of standard Oja's algorithm.
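A small sketch may help fix ideas about Oja's algorithm with unbiased stochastic quantization. It implements the plain per-sample update with a uniform (linear) rounding grid applied to the weight vector only, not the batched or nonlinear variants the abstract refers to; the function names, step size, learning rate, and synthetic stream are assumptions.

```python
import math
import random

def stochastic_round(v, step, rng):
    """Round v to the grid {step * k}, unbiased: round up with probability equal
    to the fractional position of v between its two neighbouring grid points."""
    scaled = v / step
    low = math.floor(scaled)
    return step * (low + (1 if rng.random() < scaled - low else 0))

def quantized_oja(stream, dim, lr, step, rng):
    """Oja's update for the top principal direction, with the normalized
    weight vector stochastically rounded after every sample."""
    w = [1.0 / math.sqrt(dim)] * dim
    for x in stream:
        proj = sum(wi * xi for wi, xi in zip(w, x))
        w = [wi + lr * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [stochastic_round(wi / norm, step, rng) for wi in w]
    return w

# Synthetic stream: covariance diag(9, 1, 1, 1, 1), so the top direction is e_1.
rng = random.Random(0)
dim, step = 5, 2.0 ** -8
stream = [[3.0 * rng.gauss(0.0, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(dim - 1)]
          for _ in range(3000)]
w = quantized_oja(stream, dim, lr=0.01, step=step, rng=rng)

# Absolute cosine between the estimate and the true leading eigenvector e_1.
alignment = abs(w[0]) / math.sqrt(sum(wi * wi for wi in w))
```

Because the rounding is unbiased, the quantization error behaves like extra zero-mean noise in each update rather than a systematic drift, which is why the estimate can still track the leading direction on a coarse grid.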

artificial intelligence, machine learning, quantization, (17 more...)


Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
