AITopics | px 0

Collaborating Authors

px 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

fc65fab891d83433bd3c8d966edde311-Paper-Conference.pdf

Neural Information Processing SystemsFeb-18-2026, 03:01:25 GMT

artificial intelligence, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Education > Educational Setting (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(2 more...)

Add feedback

Stability and Accuracy Trade-offs in Statistical Estimation

Chakraborty, Abhinav, Luo, Yuetian, Barber, Rina Foygel

arXiv.org Machine LearningJan-21-2026

Algorithmic stability is a central concept in statistics and learning theory that measures how sensitive an algorithm's output is to small changes in the training data. Stability plays a crucial role in understanding generalization, robustness, and replicability, and a variety of stability notions have been proposed in different learning settings. However, while stability entails desirable properties, it is typically not sufficient on its own for statistical learning -- and indeed, it may be at odds with accuracy, since an algorithm that always outputs a constant function is perfectly stable but statistically meaningless. Thus, it is essential to understand the potential statistical cost of stability. In this work, we address this question by adopting a statistical decision-theoretic perspective, treating stability as a constraint in estimation. Focusing on two representative notions-worst-case stability and average-case stability-we first establish general lower bounds on the achievable estimation accuracy under each type of stability constraint. We then develop optimal stable estimators for four canonical estimation problems, including several mean estimation and regression settings. Together, these results characterize the optimal trade-offs between stability and accuracy across these tasks. Our findings formalize the intuition that average-case stability imposes a qualitatively weaker restriction than worst-case stability, and they further reveal that the gap between these two can vary substantially across different estimation problems.

artificial intelligence, machine learning, stability, (18 more...)

arXiv.org Machine Learning

2601.11701

Country: North America > United States (0.45)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Leveraging Equivariances and Symmetries in the Control Barrier Function Synthesis

Wiltz, Adrian, Dimarogonas, Dimos V.

arXiv.org Artificial IntelligenceSep-5-2025

The synthesis of Control Barrier Functions (CBFs) often involves demanding computations or a meticulous construction. However, structural properties of the system dynamics and constraints have the potential to mitigate these challenges. In this paper, we explore how equivariances in the dynamics, loosely speaking a form of symmetry, can be leveraged in the CBF synthesis. Although CBFs are generally not inherently symmetric, we show how equivariances in the dynamics and symmetries in the constraints induce symmetries in CBFs derived through reachability analysis. This insight allows us to infer their CBF values across the entire domain from their values on a subset, leading to significant computational savings. Interestingly, equivariances can be even leveraged to the CBF synthesis for non-symmetric constraints. Specifically, we show how a partially known CBF can be leveraged together with equivariances to construct a CBF for various new constraints. Throughout the paper, we provide examples illustrating the theoretical findings. Furthermore, a numerical study investigates the computational gains from invoking equivariances into the CBF synthesis.

artificial intelligence, constraint, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2509.04399

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Artificial Intelligence > Machine Learning (0.67)

Add feedback

Learning non-equilibrium diffusions with Schrödinger bridges: from exactly solvable to simulation-free

Zhang, Stephen Y., Stumpf, Michael P H

arXiv.org Machine LearningMay-23-2025

We consider the Schrödinger bridge problem which, given ensemble measurements of the initial and final configurations of a stochastic dynamical system and some prior knowledge on the dynamics, aims to reconstruct the "most likely" evolution of the system compatible with the data. Most existing literature assume Brownian reference dynamics and are implicitly limited to potential-driven dynamics. We depart from this regime and consider reference processes described by a multivariate Ornstein-Uhlenbeck process with generic drift matrix $\mathbf{A} \in \mathbb{R}^{d \times d}$. When $\mathbf{A}$ is asymmetric, this corresponds to a non-equilibrium system with non-conservative forces at play: this is important for applications to biological systems, which are naturally exist out-of-equilibrium. In the case of Gaussian marginals, we derive explicit expressions that characterise the solution of both the static and dynamic Schrödinger bridge. For general marginals, we propose mvOU-OTFM, a simulation-free algorithm based on flow and score matching for learning the Schrödinger bridge. In application to a range of problems based on synthetic and real single cell data, we demonstrate that mvOU-OTFM achieves higher accuracy compared to competing methods, whilst being significantly faster to train.

artificial intelligence, machine learning, px 0, (17 more...)

arXiv.org Machine Learning

2505.16644

Genre: Research Report (0.64)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Revisiting Optimism and Model Complexity in the Wake of Overparameterized Machine Learning

Patil, Pratik, Du, Jin-Hong, Tibshirani, Ryan J.

arXiv.org Machine LearningOct-2-2024

Common practice in modern machine learning involves fitting a large number of parameters relative to the number of observations. These overparameterized models can exhibit surprising generalization behavior, e.g., ``double descent'' in the prediction error curve when plotted against the raw number of model parameters, or another simplistic notion of complexity. In this paper, we revisit model complexity from first principles, by first reinterpreting and then extending the classical statistical concept of (effective) degrees of freedom. Whereas the classical definition is connected to fixed-X prediction error (in which prediction error is defined by averaging over the same, nonrandom covariate points as those used during training), our extension of degrees of freedom is connected to random-X prediction error (in which prediction error is averaged over a new, random sample from the covariate distribution). The random-X setting more naturally embodies modern machine learning problems, where highly complex models, even those complex enough to interpolate the training data, can still lead to desirable generalization performance under appropriate conditions. We demonstrate the utility of our proposed complexity measures through a mix of conceptual arguments, theory, and experiments, and illustrate how they can be used to interpret and compare arbitrary prediction models.

freedom, random-x degree, regression, (17 more...)

arXiv.org Machine Learning

2410.01259

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre:

Research Report (0.50)
Overview (0.45)

Industry: Education (0.34)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Add feedback

A Low Rank Neural Representation of Entropy Solutions

Rim, Donsub, Welper, Gerrit

arXiv.org Artificial IntelligenceJun-9-2024

We construct a new representation of entropy solutions to nonlinear scalar conservation laws with a smooth convex flux function in a single spatial dimension. The representation is a generalization of the method of characteristics and posseses a compositional form. While it is a nonlinear representation, the embedded dynamics of the solution in the time variable is linear. This representation is then discretized as a manifold of implicit neural representations where the feedforward neural network architecture has a low rank structure. Finally, we show that the low rank neural representation with a fixed number of layers and a small number of coefficients can approximate any entropy solution regardless of the complexity of the shock topology, while retaining the linearity of the embedded dynamics.

approximation, artificial intelligence, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2406.05694

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Mamba State-Space Models Can Be Strong Downstream Learners

Halloran, John T., Gulati, Manbir, Roysdon, Paul F.

arXiv.org Artificial IntelligenceMay-31-2024

Mamba state-space models (SSMs) have recently outperformed state-of-the-art (SOTA) Transformer large language models (LLMs) in various tasks and been widely adapted. However, Mamba's downstream learning capabilities remain either unexplored$\unicode{x2013}$e.g., mixed-precision (MPFT) and parameter-efficient fine-tuning (PEFT)--or under-evaluated$\unicode{x2013}$e.g., in-context learning (ICL). For the latter, recent works reported Mamba's ICL rivals SOTA Transformer LLMs using non-standard benchmarks. In contrast, we show that on standard benchmarks, pretrained Mamba models achieve only 38% of the ICL performance improvements (over zero-shot) of comparable Transformers. Enabling MPFT and PEFT in Mamba architectures is challenging due to recurrent dynamics and highly customized CUDA kernels, respectively. However, we prove that Mamba's recurrent dynamics are robust to small input changes using dynamical systems theory. Empirically, we show that performance changes in Mamba's inference and fine-tuning due to mixed-precision align with Transformer LLMs. Furthermore, we show that targeting key memory buffers in Mamba's customized CUDA kernels for low-rank adaptation regularizes SSM parameters, thus achieving parameter efficiency while retaining speedups. We show that combining MPFT and PEFT enables up to 2.15 times more tokens-per-second and 65.5% reduced per-token-memory compared to full Mamba fine-tuning, while achieving up to 81.5% of the ICL performance improvements (over zero-shot) of comparably fine-tuned Transformers.

arxiv preprint arxiv, fine-tuning, mamba, (14 more...)

arXiv.org Artificial Intelligence

2406.00209

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Local Observability of VINS and LINS

Li, Xinran

arXiv.org Artificial IntelligenceApr-10-2024

Under the assumption that there exist two features observed by the camera without occlusion, the unobservable directions of VINS are uniformly globally translation and global rotations about the gravity vector. The unobservable directions of LINS are same as VINS, while only one feature need to be observed. Also, a constraint in Observability-Constrained VINS (OC-VINS) is proved.

bs 0 0 0, ffi, ffi 0, (14 more...)

arXiv.org Artificial Intelligence

2404.00066

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Minnesota > Ramsey County > Saint Paul (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (0.46)

Add feedback

Statistical Inference of Optimal Allocations I: Regularities and their Implications

Feng, Kai, Hong, Han

arXiv.org Machine LearningApr-7-2024

In this paper, we develop a functional differentiability approach for solving statistical optimal allocation problems. We first derive Hadamard differentiability of the value function through a detailed analysis of the general properties of the sorting operator. Central to our framework are the concept of Hausdorff measure and the area and coarea integration formulas from geometric measure theory. Building on our Hadamard differentiability results, we demonstrate how the functional delta method can be used to directly derive the asymptotic properties of the value function process for binary constrained optimal allocation problems, as well as the two-step ROC curve estimator. Moreover, leveraging profound insights from geometric functional analysis on convex and local Lipschitz functionals, we obtain additional generic Fr\'echet differentiability results for the value functions of optimal allocation problems. These compelling findings motivate us to study carefully the first order approximation of the optimal social welfare. In this paper, we then present a double / debiased estimator for the value functions. Importantly, the conditions outlined in the Hadamard differentiability section validate the margin assumption from the statistical classification literature employing plug-in methods that justifies a faster convergence rate.

pxq, statistical inference, theorem 3, (14 more...)

arXiv.org Machine Learning

2403.18248

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report (0.49)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Add feedback

Towards a mathematical theory for consistency training in diffusion models

Li, Gen, Huang, Zhihan, Wei, Yuting

arXiv.org Artificial IntelligenceFeb-12-2024

Consistency models, which were proposed to mitigate the high computational overhead during the sampling phase of diffusion models, facilitate single-step sampling while attaining state-of-the-art empirical performance. When integrated into the training phase, consistency models attempt to train a sequence of consistency functions capable of mapping any point at any time step of the diffusion process to its starting point. Despite the empirical success, a comprehensive theoretical understanding of consistency training remains elusive. This paper takes a first step towards establishing theoretical underpinnings for consistency models. We demonstrate that, in order to generate samples within $\varepsilon$ proximity to the target in distribution (measured by some Wasserstein metric), it suffices for the number of steps in consistency learning to exceed the order of $d^{5/2}/\varepsilon$, with $d$ the data dimension. Our theory offers rigorous insights into the validity and efficacy of consistency models, illuminating their utility in downstream inference tasks.

arxiv preprint arxiv, inequality, px 0, (13 more...)

arXiv.org Artificial Intelligence

2402.07802

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Workflow (0.66)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback