AITopics | dq 0

Collaborating Authors

dq 0

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Computable Measure of Suboptimality for Entropy-Regularised Variational Objectives

Chazal, Clémentine, Kanagawa, Heishiro, Shen, Zheyang, Korba, Anna, Oates, Chris. J.

arXiv.org Machine LearningSep-15-2025

Several emerging post-Bayesian methods target a probability distribution for which an entropy-regularised variational objective is minimised. This increased flexibility introduces a computational challenge, as one loses access to an explicit unnormalised density for the target. To mitigate this difficulty, we introduce a novel measure of suboptimality called 'gradient discrepancy', and in particular a 'kernel gradient discrepancy' (KGD) that can be explicitly computed. In the standard Bayesian context, KGD coincides with the kernel Stein discrepancy (KSD), and we obtain a novel charasterisation of KSD as measuring the size of a variational gradient. Outside this familiar setting, KGD enables novel sampling algorithms to be developed and compared, even when unnormalised densities cannot be obtained. To illustrate this point several novel algorithms are proposed, including a natural generalisation of Stein variational gradient descent, with applications to mean-field neural networks and prediction-centric uncertainty quantification presented. On the theoretical side, our principal contribution is to establish sufficient conditions for desirable properties of KGD, such as continuity and convergence control.

kernel, kgd, stationary point, (15 more...)

arXiv.org Machine Learning

2509.10393

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre:

Instructional Material (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

97ea3cfb64eeaa1edba65501d0bb3c86-Supplemental.pdf

Neural Information Processing SystemsAug-16-2025, 05:57:53 GMT

artificial intelligence, assumption 3, inequality, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.67)

Add feedback

Efficiently Access Diffusion Fisher: Within the Outer Product Span Space

Wang, Fangyikang, Yin, Hubery, Zhuang, Shaobin, Zhu, Huminhao, Li, Yinan, Qian, Lei, Zhang, Chao, Zhao, Hanbin, Qian, Hui, Li, Chen

arXiv.org Artificial IntelligenceMay-30-2025

Recent Diffusion models (DMs) advancements have explored incorporating the second-order diffusion Fisher information (DF), defined as the negative Hessian of log density, into various downstream tasks and theoretical analysis. However, current practices typically approximate the diffusion Fisher by applying auto-differentiation to the learned score network. This black-box method, though straightforward, lacks any accuracy guarantee and is time-consuming. In this paper, we show that the diffusion Fisher actually resides within a space spanned by the outer products of score and initial data. Based on the outer-product structure, we develop two efficient approximation algorithms to access the trace and matrix-vector multiplication of DF, respectively. These algorithms bypass the auto-differentiation operations with time-efficient vector-product calculations. Furthermore, we establish the approximation error bounds for the proposed algorithms. Experiments in likelihood evaluation and adjoint optimization demonstrate the superior accuracy and reduced computational cost of our proposed algorithms. Additionally, based on the novel outer-product formulation of DF, we design the first numerical verification experiment for the optimal transport property of the general PF-ODE deduced map.

artificial intelligence, fisher, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2505.23264

Country: North America (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (1.00)
Transportation (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Add feedback

Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate

Liang, Yuchen, Ju, Peizhong, Liang, Yingbin, Shroff, Ness

arXiv.org Machine LearningFeb-21-2024

The denoising diffusion model emerges recently as a powerful generative technique that converts noise into data. Theoretical convergence guarantee has been mainly studied for continuous-time diffusion models, and has been obtained for discrete-time diffusion models only for distributions with bounded support in the literature. In this paper, we establish the convergence guarantee for substantially larger classes of distributions under discrete-time diffusion models and further improve the convergence rate for distributions with bounded support. In particular, we first establish the convergence rates for both smooth and general (possibly non-smooth) distributions having finite second moment. We then specialize our results to a number of interesting classes of distributions with explicit parameter dependencies, including distributions with Lipschitz scores, Gaussian mixture distributions, and distributions with bounded support. We further propose a novel accelerated sampler and show that it improves the convergence rates of the corresponding regular sampler by orders of magnitude with respect to all system parameters. For distributions with bounded support, our result improves the dimensional dependence of the previous convergence rate by orders of magnitude. Our study features a novel analysis technique that constructs tilting factor representation of the convergence error and exploits Tweedie's formula for handling Taylor expansion power terms.

assumption 4, bounded support, dq 0, (16 more...)

arXiv.org Machine Learning

2402.13901

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
North America > United States > Ohio (0.04)

Genre: Research Report (1.00)

Industry: Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.45)

Add feedback

Generalization Bounds via Convex Analysis

Neu, Gergely, Lugosi, Gábor

arXiv.org Machine LearningFeb-10-2022

Since the celebrated works of Russo and Zou (2016,2019) and Xu and Raginsky (2017), it has been well known that the generalization error of supervised learning algorithms can be bounded in terms of the mutual information between their input and the output, given that the loss of any fixed hypothesis has a subgaussian tail. In this work, we generalize this result beyond the standard choice of Shannon's mutual information to measure the dependence between the input and the output. Our main result shows that it is indeed possible to replace the mutual information by any strongly convex function of the joint input-output distribution, with the subgaussianity condition on the losses replaced by a bound on an appropriately chosen norm capturing the geometry of the dependence measure. This allows us to derive a range of generalization bounds that are either entirely new or strengthen previously known ones. Examples include bounds stated in terms of $p$-norm divergences and the Wasserstein-2 distance, which are respectively applicable for heavy-tailed loss distributions and highly smooth loss functions. Our analysis is entirely based on elementary tools from convex analysis by tracking the growth of a potential function associated with the dependence measure and the loss function.

dependence measure, divergence, generalization error, (14 more...)

arXiv.org Machine Learning

2202.04985

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback