AITopics

2506.11285

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Levine, Lionel, Santerre, John, Young, Alex S., Levine, T. Barry, Campion, Francis, Sarrafzadeh, Majid

PRISM: A Transformer-based Language Model of Structured Clinical Event Data

arXiv.org Artificial Intelligence, Jun-16-2025

We introduce PRISM (Predictive Reasoning in Sequential Medicine), a transformer-based architecture designed to model the sequential progression of clinical decision-making processes. Unlike traditional approaches that rely on isolated diagnostic classification, PRISM frames clinical trajectories as tokenized sequences of events (including diagnostic tests, laboratory results, and diagnoses) and learns to predict the most probable next steps in the patient diagnostic journey. Leveraging a large custom clinical vocabulary and an autoregressive training objective, PRISM demonstrates the ability to capture complex dependencies across longitudinal patient timelines. Experimental results show substantial improvements over random baselines in next-token prediction tasks, with generated sequences reflecting realistic diagnostic pathways, laboratory result progressions, and clinician ordering behaviors. These findings highlight the feasibility of applying generative language modeling techniques to structured medical event data, enabling applications in clinical decision support, simulation, and education. PRISM establishes a foundation for future advancements in sequence-based healthcare modeling, bridging the gap between machine learning architectures and real-world diagnostic reasoning. Accurate and timely clinical decision-making is fundamental to high-quality patient care.

large language model, machine learning, natural language, (20 more...)

2506.11082

Country: North America > United States (0.15)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Health Care Technology (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Phi, Thu Ha, Hippert-Ferrer, Alexandre, Bouchard, Florent, Breloy, Arnaud

Leveraging Low-rank Factorizations of Conditional Correlation Matrices in Graph Learning

arXiv.org Artificial Intelligence, Jun-13-2025

This paper addresses the problem of learning an undirected graph from data gathered at each node. Within the graph signal processing framework, the topology of such a graph can be linked to the support of the conditional correlation matrix of the data. The corresponding graph learning problem then scales with the square of the number of variables (nodes), which is usually problematic in high dimensions. To tackle this issue, we propose a graph learning framework that leverages a low-rank factorization of the conditional correlation matrix. To solve the resulting optimization problems, we derive the tools required to apply Riemannian optimization techniques to this particular structure. The proposal is then particularized to a low-rank constrained counterpart of the GLasso algorithm, i.e., the penalized maximum likelihood estimation of a Gaussian graphical model. Experiments on synthetic and real data show that a very efficient dimension-versus-performance trade-off can be achieved with this approach.

artificial intelligence, machine learning, optimization problem, (19 more...)

2506.10628

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > France > Île-de-France > Hauts-de-Seine > Nanterre (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Arpogaus, Marcel, Kneib, Thomas, Nagler, Thomas, Rügamer, David

Hybrid Bernstein Normalizing Flows for Flexible Multivariate Density Regression with Interpretable Marginals

Density regression models allow a comprehensive understanding of data by modeling the complete conditional probability distribution. While flexible estimation approaches such as normalizing flows (NF) work particularly well in multiple dimensions, interpreting the input-output relationship of such models is often difficult, due to the black-box character of deep learning models. In contrast, existing statistical methods for multivariate outcomes such as multivariate conditional transformation models (MCTM) are restricted in flexibility and are often not expressive enough to represent complex multivariate probability distributions. In this paper, we combine MCTM with state-of-the-art and autoregressive NF to leverage the transparency of MCTM for modeling interpretable feature effects on the marginal distributions in the first step and the flexibility of neural-network-based NF techniques to account for complex and non-linear relationships in the joint data distribution. We demonstrate our method's versatility in various numerical experiments and compare it with MCTM and other NF models on both simulated and real-world data.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2505.14164

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Germany > Lower Saxony > Gottingen (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Sanokowski, Sebastian, Gruber, Lukas, Bartmann, Christoph, Hochreiter, Sepp, Lehner, Sebastian

Rethinking Losses for Diffusion Bridge Samplers

Diffusion bridges are a promising class of deep-learning methods for sampling from unnormalized distributions. Recent works show that the Log Variance (LV) loss consistently outperforms the reverse Kullback-Leibler (rKL) loss when using the reparametrization trick to compute rKL-gradients. While the on-policy LV loss yields identical gradients to the rKL loss when combined with the log-derivative trick for diffusion samplers with non-learnable forward processes, this equivalence does not hold for diffusion bridges or when diffusion coefficients are learned. Based on this insight, we argue that for diffusion bridges the LV loss does not represent an optimization objective that can be motivated, like the rKL loss, via the data processing inequality. Our analysis shows that employing the rKL loss with the log-derivative trick (rKL-LD) not only avoids these conceptual problems but also consistently outperforms the LV loss. Experimental results with different types of diffusion bridges on challenging benchmarks show that samplers trained with the rKL-LD loss achieve better performance. From a practical perspective, we find that rKL-LD requires significantly less hyperparameter optimization and yields more stable training behavior.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2506.10982

Country:

Europe > Austria > Upper Austria > Linz (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Ross, Brendan Leigh, Vouitsis, Noël, Ghomi, Atiyeh Ashari, Hosseinzadeh, Rasa, Xin, Ji, Liu, Zhaoyan, Sui, Yi, Hou, Shiyi, Leung, Kin Kwan, Loaiza-Ganem, Gabriel, Cresswell, Jesse C.

Textual Bayes: Quantifying Uncertainty in LLM-Based Systems

Although large language models (LLMs) are becoming increasingly capable of solving challenging real-world tasks, accurately quantifying their uncertainty remains a critical open problem, which limits their applicability in high-stakes domains. This challenge is further compounded by the closed-source, black-box nature of many state-of-the-art LLMs. Moreover, LLM-based systems can be highly sensitive to the prompts that bind them together, which often require significant manual tuning (i.e., prompt engineering). In this work, we address these challenges by viewing LLM-based systems through a Bayesian lens. We interpret prompts as textual parameters in a statistical model, allowing us to use a small training dataset to perform Bayesian inference over these prompts. This novel perspective enables principled uncertainty quantification over both the model's textual parameters and its downstream predictions, while also incorporating prior beliefs about these parameters expressed in free-form text. To perform Bayesian inference, a difficult problem even for well-studied data modalities, we introduce Metropolis-Hastings through LLM Proposals (MHLP), a novel Markov chain Monte Carlo (MCMC) algorithm that combines prompt optimization techniques with standard MCMC methods. MHLP is a turnkey modification to existing LLM pipelines, including those that rely exclusively on closed-source models. Empirically, we demonstrate that our method yields improvements in both predictive accuracy and uncertainty quantification (UQ) on a range of LLM benchmarks and UQ tasks. More broadly, our work demonstrates a viable path for incorporating methods from the rich Bayesian literature into the era of LLMs, paving the way for more reliable and calibrated LLM-based systems.
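The abstract builds on standard Metropolis-Hastings sampling. As a rough illustration of that underlying step only (not of MHLP itself), here is a minimal sketch over a tiny discrete set of candidate prompts. The prompts, their weights, and the uniform proposal are all hypothetical stand-ins; in the paper's setting an LLM generates the proposals, which would generally call for the full Hastings correction rather than the symmetric-proposal shortcut used here.

```python
import math
import random
from collections import Counter

# Hypothetical candidate prompts with made-up unnormalized posterior weights.
PROMPT_WEIGHTS = {
    "Answer briefly.": 1.0,
    "Think step by step.": 3.0,
    "Cite your sources.": 6.0,
}


def log_target(prompt):
    """Unnormalized log posterior of a prompt (an assumption for illustration)."""
    return math.log(PROMPT_WEIGHTS[prompt])


def propose(current, rng):
    """Uniform proposal over all candidates; it ignores `current`, so it is symmetric."""
    return rng.choice(list(PROMPT_WEIGHTS))


def metropolis_hastings(n_steps, seed=0):
    """Run a Metropolis chain over the candidate prompts and return every visited state."""
    rng = random.Random(seed)
    state = next(iter(PROMPT_WEIGHTS))
    samples = []
    for _ in range(n_steps):
        candidate = propose(state, rng)
        # With a symmetric proposal the acceptance ratio reduces to target(candidate) / target(state).
        log_ratio = log_target(candidate) - log_target(state)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            state = candidate
        samples.append(state)
    return samples


def empirical_frequencies(samples):
    """Fraction of chain steps spent on each candidate prompt."""
    counts = Counter(samples)
    return {prompt: counts[prompt] / len(samples) for prompt in PROMPT_WEIGHTS}
```

Over a long chain, the visit frequencies should approach the normalized weights (0.1, 0.3, and 0.6 for the values above). The spread of those frequencies is what serves as an uncertainty estimate over the textual parameter.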

large language model, machine learning, natural language, (16 more...)

2506.10060

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Northumberland (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.67)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Handke, Florian, Koulischer, Félix, Raya, Gabriel, Ambrogioni, Luca

Measuring Semantic Information Production in Generative Diffusion Models

It is well known that semantic and structural features of the generated images emerge at different times during the reverse dynamics of diffusion, a phenomenon that has been connected to physical phase transitions in magnets and other materials. In this paper, we introduce a general information-theoretic approach to measure when these class-semantic "decisions" are made during the generative process. By using an online formula for the optimal Bayesian classifier, we estimate the conditional entropy of the class label given the noisy state. We then determine the time intervals corresponding to the highest information transfer between noisy states and class labels using the time derivative of the conditional entropy. We demonstrate our method on one-dimensional Gaussian mixture models and on DDPM models trained on the CIFAR10 dataset. As expected, we find that the semantic information transfer is highest in the intermediate stages of diffusion while vanishing during the final stages. However, we find sizable differences between the entropy rate profiles of different classes, suggesting that different "semantic decisions" are located at different intermediate times.
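For the one-dimensional Gaussian mixture case, the conditional entropy of the class label given the noisy state can be estimated directly, because the optimal Bayesian classifier is available in closed form. The sketch below is an illustration under stated assumptions, not the paper's implementation: two equally likely classes, a shared within-class standard deviation, and a variance-preserving forward process x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps. All function names and default values are hypothetical.

```python
import math
import random


def class_posterior(x, means, std):
    """Exact posterior p(y | x) for an equal-weight Gaussian mixture with a shared std."""
    logits = [-0.5 * ((x - m) / std) ** 2 for m in means]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - top) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


def conditional_entropy(alpha_bar, class_means=(-2.0, 2.0), data_std=0.5,
                        n_samples=20000, seed=0):
    """Monte Carlo estimate of H(y | x_t) in nats at signal level alpha_bar in [0, 1]."""
    rng = random.Random(seed)
    scale = math.sqrt(alpha_bar)
    # Under the assumed forward process, each class stays Gaussian after noising.
    noisy_means = [scale * m for m in class_means]
    noisy_std = math.sqrt(alpha_bar * data_std ** 2 + (1.0 - alpha_bar))
    total = 0.0
    for _ in range(n_samples):
        label = rng.randrange(len(class_means))
        x_t = rng.gauss(noisy_means[label], noisy_std)
        posterior = class_posterior(x_t, noisy_means, noisy_std)
        total -= sum(p * math.log(p) for p in posterior if p > 0.0)
    return total / n_samples


def entropy_rate(alpha_bar, step=0.05):
    """Central finite difference of H(y | x_t) with respect to alpha_bar."""
    upper = conditional_entropy(min(alpha_bar + step, 1.0))
    lower = conditional_entropy(max(alpha_bar - step, 0.0))
    return (upper - lower) / (min(alpha_bar + step, 1.0) - max(alpha_bar - step, 0.0))
```

At alpha_bar = 0 the state is pure noise and the entropy equals log 2 for two classes; as alpha_bar approaches 1 the classes separate and the entropy drops toward zero. Signal levels where `entropy_rate` is largest in magnitude mark where most class information is transferred in this toy setting.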

artificial intelligence, machine learning, natural language, (14 more...)

2506.10433

Genre: Research Report (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)

Haddouche, Maxime, Sakhi, Otmane

Logarithmic Smoothing for Adaptive PAC-Bayesian Off-Policy Learning

Off-policy learning serves as the primary framework for learning optimal policies from logged interactions collected under a static behavior policy. In this work, we investigate the more practical and flexible setting of adaptive off-policy learning, where policies are iteratively refined and re-deployed to collect higher-quality data. Building on the success of PAC-Bayesian learning with Logarithmic Smoothing (LS) in static settings, we extend this framework to the adaptive scenario using tools from online PAC-Bayesian theory. Furthermore, we demonstrate that a principled adjustment to the LS estimator naturally accommodates multiple rounds of deployment and yields faster convergence rates under mild conditions. Our method matches the performance of leading offline approaches in static settings, and significantly outperforms them when intermediate policy deployments are allowed. Empirical evaluations across diverse scenarios highlight both the advantages of adaptive data collection and the strength of the PAC-Bayesian formulation.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2506.10664

Country:

North America > United States (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Alzboon, Mowafaq Salem, Al-Batah, Mohammad, Alqaraleh, Muhyeeddin, Abuashour, Ahmad, Bader, Ahmad Fuad

A Comparative Study of Machine Learning Techniques for Early Prediction of Diabetes

arXiv.org Artificial Intelligence, Jun-13-2025

In many nations, diabetes is becoming a significant health problem, and early identification and control are crucial. Using machine learning algorithms to predict diabetes has yielded encouraging results. Using the Pima Indians Diabetes dataset, this study attempts to evaluate the efficacy of several machine-learning methods for diabetes prediction. The collection includes information on 768 patients, such as their ages, BMIs, and glucose levels. The techniques assessed are Logistic Regression, Decision Tree, Random Forest, k-Nearest Neighbors, Naive Bayes, Support Vector Machine, Gradient Boosting, and Neural Network. The findings indicate that the Neural Network algorithm performed the best, with an accuracy of 78.57%. The study implies that machine learning algorithms can aid diabetes prediction and be an efficient early detection tool. Diabetes is a chronic metabolic disease affecting millions worldwide and is a significant cause of morbidity and death [1]. High blood glucose levels characterize the disorder and can result in some complications, including cardiovascular disease, stroke, blindness, and amputations. To prevent or postpone complications, diabetes must be recognized and treated as soon as feasible; however, this can be challenging because symptoms may be mild or absent [2]. Machine learning (ML) is a subfield of artificial intelligence that comprises the development of algorithms that can learn from data and generate inferences or predictions without being explicitly programmed. ML algorithms are beneficial in several fields, including healthcare.

artificial intelligence, diabetes, machine learning, (15 more...)

doi: 10.1109/ComNet60156.2023.10366688

2506.10180

Country: Asia > Middle East > Jordan (0.15)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.90)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Fickinger, Arnaud, Bendahi, Abderrahim, Russell, Stuart

Provable Sim-to-Real Transfer via Offline Domain Randomization

arXiv.org Artificial Intelligence, Jun-13-2025

Reinforcement-learning agents often struggle when deployed from simulation to the real world. A dominant strategy for reducing the sim-to-real gap is domain randomization (DR), which trains the policy across many simulators produced by sampling dynamics parameters, but standard DR ignores offline data already available from the real system. We study offline domain randomization (ODR), which first fits a distribution over simulator parameters to an offline dataset. While a growing body of empirical work reports substantial gains with algorithms such as DROPO, the theoretical foundations of ODR remain largely unexplored. In this work, we (i) formalize ODR as a maximum-likelihood estimation over a parametric simulator family, (ii) prove consistency of this estimator under mild regularity and identifiability conditions, showing it converges to the true dynamics as the dataset grows, (iii) derive gap bounds demonstrating that ODR's sim-to-real error is up to an O(M) factor tighter than that of uniform DR in the finite-simulator case (with analogous gains in the continuous setting), and (iv) introduce E-DROPO, a new version of DROPO which adds an entropy bonus to prevent variance collapse, yielding broader randomization and more robust zero-shot transfer in practice.
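The maximum-likelihood view in (i) and the consistency claim in (ii) can be illustrated on a toy problem. The sketch below is an assumption-laden example rather than the paper's estimator: it fits a single point estimate of one dynamics parameter `theta` in a hypothetical one-dimensional simulator with Gaussian noise, where the MLE has a closed least-squares form. ODR as described above fits a distribution over simulator parameters, which this example does not attempt.

```python
import random


def simulate_step(state, action, theta, noise_std, rng):
    """Hypothetical 1-D simulator: next_state = state + theta * action + Gaussian noise."""
    return state + theta * action + rng.gauss(0.0, noise_std)


def collect_offline_data(n_transitions, true_theta, noise_std=0.1, seed=0):
    """Log (state, action, next_state) transitions from the 'real' system."""
    rng = random.Random(seed)
    data = []
    state = 0.0
    for _ in range(n_transitions):
        action = rng.uniform(-1.0, 1.0)
        next_state = simulate_step(state, action, true_theta, noise_std, rng)
        data.append((state, action, next_state))
        state = next_state
    return data


def fit_theta_mle(data):
    """Maximum-likelihood theta under Gaussian noise, which reduces to least squares."""
    numerator = sum(a * (s_next - s) for s, a, s_next in data)
    denominator = sum(a * a for _, a, _ in data)
    return numerator / denominator


def estimation_error(n_transitions, true_theta=1.5):
    """Absolute error of the fitted parameter for a given offline dataset size."""
    data = collect_offline_data(n_transitions, true_theta)
    return abs(fit_theta_mle(data) - true_theta)
```

Calling `estimation_error` with increasing dataset sizes should show the fitted parameter approaching the true value, which is the behavior the consistency result formalizes for the general parametric family.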

artificial intelligence, machine learning, reinforcement learning, (20 more...)

2506.10133

Genre: Research Report (0.42)

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)