AITopics | definition 2

Collaborating Authors

definition 2

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Leave a Window Out: Modifying the Jackknife for Predictive Inference in Time Series

Jiang, Hanyang, Barber, Rina Foygel, Pananjady, Ashwin, Xie, Yao

arXiv.org Machine LearningMay-29-2026

Conformal prediction methods enjoy strong theoretical and empirical predictive inference performance, provided the data is exchangeable, and predictors are trained in a memoryless fashion. However, these assumptions and constraints are impractical in many real-data settings, such as time series (where temporal dependence violates exchangeability, and where memoryless predictors will inevitably have poor predictive accuracy). Recent work shows that the split conformal prediction method is robust to these issues of memory-based predictors and deviations from exchangeability that are common features of time-series data. However, since using sample splitting can lead to lower accuracy, this motivates asking whether other predictive inference methods (that do not rely on data splitting) could also be reliably used in the time series setting. In this work, we show that the vanilla leave-one-out jackknife can suffer an arbitrary loss of coverage even in canonical time series models with mild temporal dependence. As a remedy, we propose a careful modification tailored to such settings, which we term the \emph{leave-a-window-out} (LWO) method, and show that it can achieve valid coverage provided that the model-fitting procedure satisfies mild stability properties. Our proofs are based on quantifying the degree to which the data departs from \emph{cyclic exchangeability}, and we introduce new coefficients to measure the extent of this departure. Experiments on time series data demonstrate that our LWO method often enjoys valid coverage when the vanilla jackknife fails to cover, while producing much narrower intervals than split conformal prediction.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Machine Learning

2605.30292

Country: North America > United States (0.67)

Genre: Research Report (1.00)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
(2 more...)

Add feedback

Gaussian Processes with Sample Paths in Reproducing Kernel Banach Spaces

Karvonen, Toni, Sørensen, Rasmus Kleist Hørlyck

arXiv.org Machine LearningMay-28-2026

We investigate the connection between Gaussian processes and Gaussian random elements in reproducing kernel Banach spaces. We show that the covariance operator of a weak second-order Radon probability measure on such a space is uniquely determined by a positive definite function. In the Gaussian case, we characterize those positive definite functions that arise from covariance operators in terms of $γ$-radonifying operators. Building on these results, we extend the classical Driscoll theorem to the Banach space setting.

artificial intelligence, machine learning, operator, (17 more...)

arXiv.org Machine Learning

2605.28106

Country:

North America > United States (0.28)
Europe > Finland (0.28)
Europe > Denmark (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

Computational aspects of the Volterra Signature

Hager, Paul P., Harang, Fabian N., Pelizzari, Luca, Tindel, Samy

arXiv.org Machine LearningMay-19-2026

The Volterra signature extends the classical path signature by incorporating general matrix-valued kernel into its iterated integral structure, yielding a flexible notion of memory for time series. Its components can be viewed as successive Picard iterates of linear controlled Volterra equations, making their exact computation of additional mathematical interest. However, the kernel introduces substantial algorithmic challenges. We provide a resolution by first decomposing the Chen-type convolution relation established in [13] into analytic and arithmetic parts, and then introducing several efficient algorithms: a general approximative scheme with quadratic complexity O(J2) in the number of time steps J, an FFT-based acceleration with complexity O(J logJ) for convolution kernels on uniform grids, and an exact recursion with complexity O(JR2) for kernels admitting a state-space representation of dimension R; retaining standard signature complexity in the path dimension and truncation level N. We further show that the number of factors in matrix-valued kernels of the form K(t,s) = P p kp(t s)Ap do not increase the asymptotic complexity in J and N. Finally, we derive a finite-difference predictor-corrector scheme for the associated Volterra signature kernel. All algorithms are implemented in the publicly available JAX-based package tensordev.

artificial intelligence, machine learning, mathematics of computing, (19 more...)

arXiv.org Machine Learning

2605.18406

Country:

Europe (0.92)
North America > United States (0.92)

Genre: Research Report (0.63)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Add feedback

Statistical Limits and Efficient Algorithms for Differentially Private Federated Learning

Auddy, Arnab, Peng, Xiangni, Paul, Subhadeep

arXiv.org Machine LearningMay-19-2026

Federated Learning is a leading framework for training ML and AI models collaboratively across numerous user devices or databases. We study the trade-offs among estimation accuracy, privacy constraints, and communication cost for differentially private (DP) federated M estimation. The two standard methods in the literature are FedAvg, which may suffer from high federation bias, and FedSGD, which can incur high communication cost. Aimed at improving accuracy at a reduced communication cost, we propose FedHybrid, which uses FedSGD starting with an improved initialization by the FedAvg estimator. We propose FedNewton, which averages local Newton iterations to reduce bias in FedAvg, achieving an estimation accuracy comparable to FedSGD with much fewer communication rounds when the number of clients grows sufficiently slowly. We establish finite sample upper bounds on the mean-squared error rates of the DP versions of these estimators as functions of the number of clients, local sample sizes, privacy budget, and number of iterations. We further derive a minimax lower bound on the MSE of any iterative private federated procedure that provides a benchmark to assess the optimality gap of these methods. We numerically evaluate our methods for training a logistic regression and a neural network on the computer vision datasets MNIST and CIFAR-10.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Machine Learning

2605.18656

Country: North America > United States > Ohio (0.40)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

On Hallucinations in Inverse Problems: Fundamental Limits and Provable Assessment Methods

Iagaru, David, Gottschling, Nina M., Hansen, Anders C., Garnier, Josselin

arXiv.org Machine LearningMay-14-2026

While deep learning has revolutionised inverse problems, its safe deployment is hindered by three primary reliability concerns: hallucinations, instabilities, and performance volatility [48]. Hallucinations manifest as high-fidelity features that are factually false; instabilities reflect heightened sensitivity to measurement noise; and performance volatility refers to significant fluctuations in reconstruction quality across the data, yielding high-fidelity results for some samples while failing on seemingly similar images. In many applications, the risk of generating realistic but unfaithful content can impede the safe deployment of AI methods for inverse problems. The choice of "hallucinate" as the Cambridge Dictionary's word of the year in 2023 illustrates this open problem [53]. The problem of AI hallucinations persists, as the Financial Times [44] highlighted that, "AI hallucinations haunt users more than job losses." A first step toward training AI methods that do not suffer from hallucinations is the assessment and identification of hallucinated outputs. Consider the inverse problem of recovering xfrom noisy measurements y " Fpx,eq, x PM1 ĂX, e PEĂY, (1.1)

artificial intelligence, machine learning, reconstruction, (18 more...)

arXiv.org Machine Learning

2605.13146

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Energy (1.00)
(2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Stable Distance Persistence Homology for Dynamic Bayesian Network Clustering

Bales, Will, Rovi, Carmen

arXiv.org Machine LearningMay-13-2026

Dynamic Bayesian networks (DBNs) are a widely used framework for modeling systems whose probabilistic structure evolves over time. Standard inference methods focus on local conditional distributions and can miss larger-scale patterns in how dependencies between variables organize and change over time. We introduce a topological approach to this problem. To each DBN we associate a time-varying graph, called a Dynamic Bayesian Graph (DBG), by assigning to each edge a strength that measures variation in its conditional dependence across parent configurations, and retaining edges whose strength exceeds a chosen threshold. We show that this construction fits within the dynamic graph framework of Kim and Mémoli, enabling the use of tools from topological data analysis. Applying persistent homology to a DBG produces a barcode, which records the merging and disappearance of connected groups of strongly dependent variables over time. We prove that this barcode is stable: small perturbations in the conditional probability tables of the DBN lead to small changes in the resulting barcode. This yields a principled and noise-resistant summary of how dependency structure evolves in a dynamic Bayesian network.

artificial intelligence, definition 1, machine learning, (15 more...)

arXiv.org Machine Learning

2605.11226

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

fcd3909db30887ce1da519c4468db668-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 09:54:54 GMT

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (0.68)

Industry:

Banking & Finance (0.67)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Game Theory (0.68)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

fcd3909db30887ce1da519c4468db668-Paper-Conference.pdf

Neural Information Processing SystemsApr-30-2026, 09:54:50 GMT

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.92)
Europe (0.67)

Genre: Research Report (0.68)

Industry:

Banking & Finance (0.67)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Game Theory (0.68)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

How to Scale Your EMA

Neural Information Processing SystemsApr-30-2026, 03:38:07 GMT

Preserving training dynamics across batch sizes is an important tool for practical machine learning as it enables the trade-off between batch size and wall-clock time. This trade-off is typically enabled by a scaling rule, for example, in stochastic gradient descent, one should scale the learning rate linearly with the batch size. Another important machine learning tool is the model EMA, a functional copy of a target model, whose parameters move towards those of its target model according to an Exponential Moving Average (EMA) at a rate parameterized by a momentum hyperparameter. This model EMA can improve the robustness and generalization of supervised learning, stabilize pseudo-labeling, and provide a learning signal for Self-Supervised Learning (SSL). Prior works have not considered the optimization of the model EMA when performing scaling, leading to different training dynamics across batch sizes and lower model performance. In this work, we provide a scaling rule for optimization in the presence of a model EMA and demonstrate the rule's validity across a range of architectures, optimizers, and data modalities. We also show the rule's validity where the model EMA contributes to the optimization of the target model, enabling us to train EMA-based pseudo-labeling and SSL methods at small and large batch sizes. For SSL, we enable training of BYOL up to batch size 24,576 without sacrificing performance, a 6 wall-clock time reduction under idealized hardware settings.

artificial intelligence, emascaling rule, machine learning, (17 more...)

Neural Information Processing Systems

Country: