Interpretable Generative and Discriminative Learning for Multimodal and Incomplete Clinical Data
Belenguer-Llorens, Albert, Sevilla-Salcedo, Carlos, Mourao-Miranda, Janaina, Gómez-Verdejo, Vanessa
Real-world clinical problems are often characterized by multimodal data, usually associated with incomplete views and limited sample sizes in their cohorts, posing significant limitations for machine learning algorithms. In this work, we propose a Bayesian approach designed to efficiently handle these challenges while providing interpretable solutions. Our approach integrates (1) a generative formulation to capture cross-view relationships with a semi-supervised strategy, and (2) a discriminative task-oriented formulation to identify relevant information for specific downstream objectives. This dual generative-discriminative formulation offers both general understanding and task-specific insights; thus, it provides automatic imputation of missing views while enabling robust inference across different data sources. The potential of this approach becomes evident when applied to multimodal clinical data, where our algorithm is able to capture and disentangle the complex interactions among biological, psychological, and sociodemographic modalities.
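As a toy illustration of the cross-view imputation idea (not the authors' Bayesian model), the sketch below generates two views from a shared latent factor and fills in a missing view with a cross-view regression fitted on complete cases; all dimensions, noise levels, and the regression stand-in are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multimodal data: two views driven by a shared 2-D latent factor.
n, d_latent, d1, d2 = 500, 2, 5, 4
Z = rng.normal(size=(n, d_latent))
W1 = rng.normal(size=(d_latent, d1))
W2 = rng.normal(size=(d_latent, d2))
X1 = Z @ W1 + 0.1 * rng.normal(size=(n, d1))
X2 = Z @ W2 + 0.1 * rng.normal(size=(n, d2))

# Pretend view 2 is missing for the last 100 subjects.
obs = slice(0, 400)
mis = slice(400, 500)

# A cross-view regression learned on complete cases stands in for the
# generative model's posterior-mean imputation of the missing view.
B, *_ = np.linalg.lstsq(X1[obs], X2[obs], rcond=None)
X2_hat = X1[mis] @ B

rmse = np.sqrt(np.mean((X2_hat - X2[mis]) ** 2))
baseline = np.sqrt(np.mean((X2[mis] - X2[obs].mean(axis=0)) ** 2))
print(rmse < baseline)  # cross-view imputation beats a mean-fill baseline
```

Because both views share the latent factor, the observed view carries most of the information needed to reconstruct the missing one; a full Bayesian treatment would additionally propagate imputation uncertainty.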
Incorporating Expert Knowledge into Bayesian Causal Discovery of Mixtures of Directed Acyclic Graphs
Björkman, Zachris, Loría, Jorge, Wharrie, Sophie, Kaski, Samuel
Bayesian causal discovery benefits from prior information elicited from domain experts, and in heterogeneous domains prior knowledge is especially needed. However, prior elicitation approaches to date have assumed a single causal graph and hence are not suited to heterogeneous domains. We propose a causal elicitation strategy for heterogeneous settings, based on Bayesian experimental design (BED) principles, and a variational mixture structure learning (VaMSL) method -- extending the earlier differentiable Bayesian structure learning (DiBS) method -- to iteratively infer mixtures of causal Bayesian networks (CBNs). We construct an informative graph prior incorporating elicited expert feedback in the inference of mixtures of CBNs. Our proposed method successfully produces a set of alternative causal models (mixture components or clusters), and achieves an improved structure learning performance on heterogeneous synthetic data when informed by a simulated expert. Finally, we demonstrate that our approach is capable of capturing complex distributions in a breast cancer database.
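As a generic sketch of the BED principle invoked here (an assumption about the setup, not necessarily the paper's exact criterion), a standard choice selects the expert query that maximizes expected information gain about the posterior over structures:

```latex
% Expected-information-gain design over the posterior on structures G,
% where \xi indexes candidate expert queries and y the expert's answer:
\xi^{*} \;=\; \arg\max_{\xi}\;
\mathbb{E}_{y \sim p(y \mid \xi)}
\big[\, H\!\left[p(G)\right] \;-\; H\!\left[p(G \mid y,\, \xi)\right] \,\big]
```

In the mixture setting, $p(G)$ would be replaced by a distribution over mixtures of causal graphs, making each query informative about cluster structure as well as edges.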
A Concise Mathematical Description of Active Inference in Discrete Time
van Oostrum, Jesse, Langer, Carlotta, Ay, Nihat
Active inference is a theory that describes the behavior (action selection mechanism) of an agent in an environment. We aim to present a concise mathematical description of the theory so that a reader interested in the mathematical details can quickly find what they are looking for. We have paid special attention to choosing notation that is more in line with standard mathematical texts and is also descriptive, in the sense that dependencies are made explicit. The aim of this paper is not to justify the theory or convince the reader that this is the right theory. The paper is divided into a main text and an appendix. The main text aims to present a clear and simple picture of active inference in discrete time that is accessible to people new to the topic. It is further subdivided into an inference part, which assumes the existence of a generative model, a learning part, in which we discuss how the agent can learn this model, and an example, illustrating the action selection mechanism. In the appendix the more subtle details and derivations are discussed. This part is aimed at people who have already studied the active inference literature but struggle to make sense of the mathematical details.
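For orientation, one common discrete-time form of the expected free energy that drives action selection is the following (the notation here is a generic choice and may differ from the paper's):

```latex
% Expected free energy of a policy \pi, summed over future time steps \tau:
G(\pi) \;=\; \sum_{\tau} \mathbb{E}_{q(o_\tau,\, s_\tau \mid \pi)}
\big[\, \ln q(s_\tau \mid \pi) \;-\; \ln p(o_\tau,\, s_\tau) \,\big]
% Policies are then selected via a softmax with precision \gamma:
\qquad q(\pi) \;=\; \sigma\!\big( -\gamma\, G(\pi) \big)
```

Here $q$ is the agent's approximate posterior under the generative model $p$; lower expected free energy makes a policy more probable.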
Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck
Martini, K. Michael, Nemenman, Ilya
The Symmetric Information Bottleneck (SIB), an extension of the more familiar Information Bottleneck, is a dimensionality reduction technique that simultaneously compresses two random variables to preserve information between their compressed versions. We introduce the Generalized Symmetric Information Bottleneck (GSIB), which explores different functional forms of the cost of such simultaneous reduction. We then explore the dataset size requirements of such simultaneous compression. We do this by deriving bounds and root-mean-squared estimates of statistical fluctuations of the involved loss functions. We show that, in typical situations, the simultaneous GSIB compression requires qualitatively less data to achieve the same errors compared to compressing variables one at a time. We suggest that this is an example of a more general principle that simultaneous compression is more data efficient than independent compression of each of the input variables.
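For reference, the SIB objective being generalized is often written as a Lagrangian of the following form (a standard formulation, not necessarily the paper's exact notation; $\beta_X$, $\beta_Y$ are compression trade-off parameters):

```latex
% Symmetric IB: maximize relevance between the two compressed variables
% while penalizing the two compression costs:
\max_{q(z_X \mid x),\; q(z_Y \mid y)} \;\;
I(Z_X;\, Z_Y) \;-\; \beta_X\, I(X;\, Z_X) \;-\; \beta_Y\, I(Y;\, Z_Y)
```

The GSIB, as described in the abstract, replaces the linear compression penalties with more general functional forms of these mutual-information costs.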
Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability
Chen, Wei-Rui, Adebara, Ife, Doan, Khai Duy, Liao, Qisheng, Abdul-Mageed, Muhammad
Recently, ChatGPT has emerged as a powerful NLP tool that can carry out several tasks. However, the range of languages ChatGPT can handle remains largely a mystery. In this work, we investigate ChatGPT's language identification abilities. For this purpose, we compile Babel-670, a benchmark comprising $670$ languages representing $23$ language families. Languages in Babel-670 run the gamut between the very high-resource to the very low-resource and are spoken in five continents. We then study ChatGPT's (both GPT-3.5 and GPT-4) ability to (i) identify both language names and language codes (ii) under both zero- and few-shot conditions (iii) with and without provision of a label set. When compared to smaller finetuned language identification tools, we find that ChatGPT lags behind. Our empirical analysis suggests that ChatGPT still requires substantial improvement before it can sufficiently serve diverse language communities.
From Graph Generation to Graph Classification
The graph classification task is to assign a discrete class label to an input graph. The dominant approach for neural graph classification is to compute an embedding for the input graph and perform the final classification in embedding space. The successful graph coarsening approach aggregates graph structural information at successively lower resolutions until a final embedding is obtained. Another direction for graph learning, so far unrelated, is graph generation. A graph generative model (GGM) aims to generate realistic graphs, often by sampling from a distribution over graphs. GGMs include the graph Variational Auto-Encoder (GVAE), auto-regressive methods, and most recently graph diffusion models.
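The embed-then-classify pipeline described above can be caricatured with a hand-built two-dimensional embedding (degree statistics standing in for a learned coarsening) and a nearest-centroid classifier; the graph families and the embedding are illustrative assumptions, not any method from the paper:

```python
import numpy as np

# Toy embedding-based graph classification: map each graph to a fixed-size
# embedding and classify in embedding space. Real methods learn the
# embedding, e.g. by successive graph coarsening.

def cycle_graph(n):
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return A

def star_graph(n):
    A = np.zeros((n, n), dtype=int)
    A[0, 1:] = A[1:, 0] = 1
    return A

def embed(A):
    deg = A.sum(axis=1)
    return np.array([deg.mean(), deg.std()])  # crude 2-D graph embedding

# Nearest-centroid classification in embedding space (class 0: cycles,
# class 1: stars), fitted on one example of each size-6 graph.
train = [(cycle_graph(6), 0), (star_graph(6), 1)]
centroids = {y: embed(A) for A, y in train}

def classify(A):
    e = embed(A)
    return min(centroids, key=lambda y: np.linalg.norm(e - centroids[y]))

print(classify(cycle_graph(9)), classify(star_graph(9)))  # → 0 1
```

The point of the toy is only that classification happens entirely in embedding space; a GGM-based approach would instead exploit a generative model over graphs for the same task.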
Convergence for score-based generative modeling with polynomial complexity
Lee, Holden, Lu, Jianfeng, Tan, Yixin
Score-based generative modeling (SGM) is a highly successful approach for learning a probability distribution from data and generating further samples. We prove the first polynomial convergence guarantees for the core mechanic behind SGM: drawing samples from a probability density $p$ given a score estimate (an estimate of $\nabla \ln p$) that is accurate in $L^2(p)$. Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality. Our guarantee works for any smooth distribution and depends polynomially on its log-Sobolev constant. Using our guarantee, we give a theoretical analysis of score-based generative modeling, which transforms white-noise input into samples from a learned data distribution given score estimates at different noise scales. Our analysis gives theoretical grounding to the observation that an annealed procedure is required in practice to generate good samples, as our proof depends essentially on using annealing to obtain a warm start at each step. Moreover, we show that a predictor-corrector algorithm gives better convergence than using either portion alone.
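The annealed sampling mechanic analyzed in this work can be sketched with Langevin dynamics on a toy target whose score is known in closed form; the Gaussian target, noise schedule, and step sizes below are illustrative assumptions, with the exact score standing in for a learned score estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: standard normal. Data ~ N(0, 1) perturbed by N(0, sigma^2) noise
# has density N(0, 1 + sigma^2), so the exact perturbed score is known.
# A learned score network would replace this function in practice.
def score(x, sigma):
    return -x / (1.0 + sigma ** 2)

# Annealed Langevin dynamics: run Langevin steps at decreasing noise
# scales, using samples from each level to warm-start the next.
x = rng.normal(scale=10.0, size=5000)      # deliberately bad initialization
sigmas = [5.0, 2.0, 1.0, 0.5, 0.1, 0.0]    # annealing schedule
for sigma in sigmas:
    eps = 0.1 * (1.0 + sigma ** 2)         # step size scaled to noise level
    for _ in range(100):
        x = x + 0.5 * eps * score(x, sigma) + np.sqrt(eps) * rng.normal(size=x.shape)

print(round(x.mean(), 2), round(x.std(), 2))  # close to (0, 1)
```

The warm start provided by each noise level is exactly the role annealing plays in the paper's analysis: without it, Langevin dynamics started far from the target would mix slowly.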
Accelerating Stochastic Probabilistic Inference
Recently, Stochastic Variational Inference (SVI) has become increasingly attractive thanks to its ability to find good posterior approximations of probabilistic models. It optimizes the variational objective with stochastic optimization, following noisy estimates of the natural gradient. However, almost all state-of-the-art SVI algorithms are based on first-order optimization and often suffer from poor convergence rates. In this paper, we bridge the gap between second-order methods and stochastic variational inference by proposing a second-order stochastic variational inference approach. In particular, we first derive the Hessian matrix of the variational objective. We then devise two numerical schemes to implement second-order SVI efficiently. Thorough empirical evaluations on both synthetic and real datasets back up both the effectiveness and efficiency of the proposed approach.
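For concreteness, a second-order update of the kind described might look as follows (a standard Newton-style sketch under generic notation, not necessarily the paper's exact scheme):

```latex
% Variational objective (ELBO) with variational parameters \lambda:
\mathcal{L}(\lambda) \;=\; \mathbb{E}_{q_\lambda(z)}
\big[\, \log p(x,\, z) \;-\; \log q_\lambda(z) \,\big]
% Stochastic Newton-style ascent with noisy estimates
% \hat{g}_t \approx \nabla_\lambda \mathcal{L}(\lambda_t) and
% \hat{H}_t \approx \nabla^2_\lambda \mathcal{L}(\lambda_t):
\lambda_{t+1} \;=\; \lambda_t \;-\; \rho_t\, \hat{H}_t^{-1}\, \hat{g}_t
```

When $\mathcal{L}$ is locally concave, $\hat{H}_t$ is negative definite and the step ascends the ELBO; the practical schemes in the paper would correspond to efficient ways of forming or applying $\hat{H}_t^{-1}$.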