Calibrating Scientific Foundation Models with Inference-Time Stochastic Attention
Yadav, Akash; Adebiyi, Taiwo A.; Zhang, Ruda
Transformer-based scientific foundation models are increasingly deployed in high-stakes settings, but current architectures produce deterministic outputs and provide limited support for calibrated predictive uncertainty. We propose Stochastic Attention, a lightweight inference-time modification that randomizes attention by replacing softmax weights with normalized multinomial samples controlled by a single concentration parameter, and produces predictive ensembles without retraining. To set this parameter, we introduce a calibration objective that matches the stochastic attention output to the target, yielding an efficient univariate post-hoc tuning problem. We evaluate this mechanism on two scientific foundation models for weather and time-series forecasting, along with an additional regression task. Across benchmarks against uncertainty-aware baselines, we find that Stochastic Attention achieves the strongest native calibration and the sharpest prediction intervals at comparable coverage, while requiring only minutes of post-hoc tuning versus days of retraining for competitive baselines.
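As a rough illustration of the mechanism described in the abstract, the sketch below replaces each row of softmax attention weights with a normalized Multinomial sample. The reading of the concentration parameter as the number of Multinomial draws (called `kappa` here), and all function and variable names, are assumptions for illustration rather than the authors' implementation.

```python
import numpy as np

def stochastic_attention(Q, K, V, kappa=64, rng=None):
    """Inference-time stochastic attention, minimal sketch.

    Each row of softmax weights is replaced by a normalized
    Multinomial(kappa, p) sample.  Small kappa gives high variance;
    kappa -> infinity recovers the deterministic softmax limit.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # (n_q, n_k) logits
    p = np.exp(scores - scores.max(-1, keepdims=True))
    p /= p.sum(-1, keepdims=True)                       # softmax weights
    counts = np.stack([rng.multinomial(kappa, row) for row in p])
    w = counts / kappa                                  # resampled weights
    return w @ V

# Predictive ensemble without retraining: repeat the stochastic forward pass.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
ensemble = np.stack([stochastic_attention(Q, K, V, kappa=32, rng=rng) for _ in range(16)])
mean, spread = ensemble.mean(0), ensemble.std(0)        # predictive mean and spread
```

Because the ensemble comes from repeating the stochastic forward pass, post-hoc calibration of the single parameter `kappa` reduces to a one-dimensional search, consistent with the minutes-versus-days tuning cost reported in the abstract.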
Ordinary Least Squares is a Special Case of Transformer
The statistical essence of the Transformer architecture has long remained elusive: is it a universal approximator, or a neural network version of known computational algorithms? Through rigorous algebraic proof, we show that the latter better describes the Transformer's basic nature: Ordinary Least Squares (OLS) is a special case of the single-layer Linear Transformer. Using the spectral decomposition of the empirical covariance matrix, we construct a specific parameter setting under which the attention mechanism's forward pass becomes mathematically equivalent to the OLS closed-form projection. This means attention can solve the regression problem in a single forward pass rather than by iteration. Building on this prototypical case, we further uncover a decoupled slow- and fast-memory mechanism within Transformers. Finally, we discuss the evolution from this linear prototype to standard Transformers, a progression that takes the Hopfield energy function from linear to exponential memory capacity and thereby establishes a clear continuity between modern deep architectures and classical statistical inference.
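The construction itself is the paper's; the snippet below is only a minimal numerical sketch of the equivalence under the simplest parameterization, with a data-dependent key/query matrix set to the inverse empirical covariance rather than the spectral-decomposition route described in the abstract.

```python
import numpy as np

# Sketch: a single linear attention layer whose key/query weight matrix is the
# inverse empirical covariance reproduces the OLS prediction in one forward pass.
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))                    # context inputs (keys)
beta = rng.normal(size=d)
y = X @ beta + 0.1 * rng.normal(size=n)        # context targets (values)
x_q = rng.normal(size=d)                       # query input

# OLS closed-form prediction at the query.
ols_pred = x_q @ np.linalg.solve(X.T @ X, X.T @ y)

# Linear attention: output = sum_i (x_q^T W x_i) * y_i with W = (X^T X)^{-1}.
W = np.linalg.inv(X.T @ X)                     # data-dependent parameter choice
attn_pred = sum((x_q @ W @ x_i) * y_i for x_i, y_i in zip(X, y))

assert np.isclose(ols_pred, attn_pred)         # identical up to float error
```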
Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks
Kinoshita, Yuri; Nishikawa, Naoki; Toyoizumi, Taro
Dataset distillation, a training-aware data compression technique, has recently attracted increasing attention as an effective tool for mitigating the costs of optimization and data storage. However, progress remains largely empirical: the mechanisms underlying the extraction of task-relevant information from the training process, and the efficient encoding of that information into synthetic data points, remain elusive. In this paper, we theoretically analyze practical dataset distillation algorithms applied to the gradient-based training of two-layer neural networks of width $L$. By focusing on a non-linear task structure called the multi-index model, we prove that the low-dimensional structure of the problem is efficiently encoded into the resulting distilled data. This dataset reproduces a model with high generalization ability at a required memory complexity of $\tilde{\Theta}(r^2 d + L)$, where $d$ and $r$ are the input and intrinsic dimensions of the task, respectively. To the best of our knowledge, this is one of the first theoretical works to incorporate a specific task structure, leverage its intrinsic dimensionality to quantify the compression rate, and study dataset distillation implemented solely via gradient-based algorithms.
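To make the setting concrete, here is a hedged toy sketch of distillation by gradient matching on a multi-index target. The simplifications (matching only the first-layer gradient, distilling only the synthetic labels, and solving the matching problem in closed form) are illustrative assumptions, not the algorithms analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, L, n_real, n_syn = 20, 2, 16, 512, 8

A = rng.normal(size=(r, d)) / np.sqrt(d)            # multi-index target: y = g(Ax)
target = lambda X: np.tanh(X @ A.T).sum(axis=1)
X_real = rng.normal(size=(n_real, d))
y_real = target(X_real)

W0 = rng.normal(size=(L, d)) / np.sqrt(d)           # shared first-layer initialization

def first_layer_grad(W, X, y):
    """Gradient of 0.5*MSE w.r.t. W for f(x) = mean_j tanh(w_j . x)."""
    H = np.tanh(X @ W.T)                            # (n, L) hidden activations
    resid = H.mean(axis=1) - y
    dH = (1.0 - H ** 2) / (X.shape[0] * W.shape[0])
    return (resid[:, None] * dH).T @ X              # (L, d)

g_real = first_layer_grad(W0, X_real, y_real)       # gradient on the full dataset

# Distill: choose synthetic labels whose gradient best matches the real-data
# gradient.  With X_syn fixed, the synthetic gradient is linear in the labels,
# so the gradient-matching objective reduces to a least-squares problem.
X_syn = rng.normal(size=(n_syn, d))
H = np.tanh(X_syn @ W0.T)
dH = (1.0 - H ** 2) / (n_syn * L)
M = np.stack([np.outer(dH[i], X_syn[i]).ravel() for i in range(n_syn)], axis=1)
resid_opt, *_ = np.linalg.lstsq(M, g_real.ravel(), rcond=None)
y_syn = H.mean(axis=1) - resid_opt

err_naive = np.linalg.norm(first_layer_grad(W0, X_syn, target(X_syn)) - g_real)
err_distilled = np.linalg.norm(first_layer_grad(W0, X_syn, y_syn) - g_real)
print(f"matching error: naive labels {err_naive:.3e} vs distilled labels {err_distilled:.3e}")
```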