
Beyond the Mean: Distribution-Aware Loss Functions for Bimodal Regression

Mohammadi-Seif, Abolfazl, Soares, Carlos, Ribeiro, Rita P., Baeza-Yates, Ricardo

arXiv.org Machine Learning

Despite the strong predictive performance achieved by machine learning models across many application domains, assessing their trustworthiness through reliable estimates of predictive confidence remains a critical challenge. This issue arises in scenarios where the likelihood of error inferred from learned representations follows a bimodal distribution, resulting from the coexistence of confident and ambiguous predictions. Standard regression approaches often struggle to adequately express this predictive uncertainty, as they implicitly assume unimodal Gaussian noise, leading to mean-collapse behavior in such settings. Although Mixture Density Networks (MDNs) can represent different distributions, they suffer from severe optimization instability. We propose a family of distribution-aware loss functions integrating normalized RMSE with Wasserstein and Cramér distances. When applied to standard deep regression models, our approach recovers bimodal distributions without the volatility of mixture models. Validated across four experimental stages, our results show that the proposed Wasserstein loss establishes a new Pareto efficiency frontier: matching the stability of standard regression losses like MSE in unimodal tasks while reducing Jensen-Shannon Divergence by 45% on complex bimodal datasets. Our framework strictly dominates MDNs in both fidelity and robustness, offering a reliable tool for aleatoric uncertainty estimation in trustworthy AI systems.
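The abstract does not give the exact loss formulation, but for one-dimensional empirical distributions the 1-Wasserstein distance has a closed form: the mean absolute difference of the sorted samples. A minimal sketch of how such a distance could be blended with normalized RMSE follows; the function names and the weighting parameter `alpha` are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def wasserstein_1d(pred, target):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples:
    the mean absolute difference of their order statistics (sorted values)."""
    return float(np.mean(np.abs(np.sort(pred) - np.sort(target))))

def distribution_aware_loss(pred, target, alpha=0.5):
    """Illustrative blend of normalized RMSE (pointwise fit) with the
    Wasserstein term (distributional fit). `alpha` is a hypothetical weight."""
    rmse = np.sqrt(np.mean((pred - target) ** 2))
    nrmse = rmse / (np.std(target) + 1e-12)  # normalize by target spread
    return alpha * nrmse + (1 - alpha) * wasserstein_1d(pred, target)
```

The Wasserstein term only compares marginal distributions, which is why it can pull predictions toward both modes of a bimodal target even when the pointwise term alone would collapse them to the mean.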


Multimodal Scientific Learning Beyond Diffusions and Flows

Guilhoto, Leonardo Ferreira, Kaushal, Akshat, Perdikaris, Paris

arXiv.org Machine Learning

Scientific machine learning (SciML) increasingly requires models that capture multimodal conditional uncertainty arising from ill-posed inverse problems, multistability, and chaotic dynamics. While recent work has favored highly expressive implicit generative models such as diffusion and flow-based methods, these approaches are often data-hungry, computationally costly, and misaligned with the structured solution spaces frequently found in scientific problems. We demonstrate that Mixture Density Networks (MDNs) provide a principled yet largely overlooked alternative for multimodal uncertainty quantification in SciML. As explicit parametric density estimators, MDNs impose an inductive bias tailored to low-dimensional, multimodal physics, enabling direct global allocation of probability mass across distinct solution branches. This structure delivers strong data efficiency, allowing reliable recovery of separated modes in regimes where scientific data is scarce. We formalize these insights through a unified probabilistic framework contrasting explicit and implicit distribution networks, and demonstrate empirically that MDNs achieve superior generalization, interpretability, and sample efficiency across a range of inverse, multistable, and chaotic scientific regression tasks.
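As explicit density estimators, MDNs output mixture weights, means, and scales for each input; the predictive density is then a Gaussian mixture that can allocate mass to separated solution branches directly. A minimal sketch of evaluating and sampling that mixture head (the function names and fixed parameter values are illustrative, not from the paper):

```python
import numpy as np

def mdn_density(y, weights, means, sigmas):
    """Evaluate the MDN predictive density p(y|x) = sum_k pi_k N(y; mu_k, sigma_k^2)
    for one input's mixture parameters (Gaussian components)."""
    weights, means, sigmas = map(np.asarray, (weights, means, sigmas))
    comps = np.exp(-0.5 * ((y - means) / sigmas) ** 2) / (sigmas * np.sqrt(2 * np.pi))
    return float(np.sum(weights * comps))

def mdn_sample(weights, means, sigmas, n, seed=None):
    """Ancestral sampling: pick a component by its weight, then draw from
    that component's Gaussian."""
    rng = np.random.default_rng(seed)
    k = rng.choice(len(weights), size=n, p=weights)
    return rng.normal(np.asarray(means)[k], np.asarray(sigmas)[k])
```

With two well-separated components (e.g. means at -2 and 2, small scales), samples land in both modes rather than averaging between them, which is exactly the behavior implicit generative models need far more data to recover.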



Bayesian Neural Networks vs. Mixture Density Networks: Theoretical and Empirical Insights for Uncertainty-Aware Nonlinear Modeling

Ghosh, Riddhi Pratim, Barnett, Ian

arXiv.org Artificial Intelligence

Modeling complex, non-linear, and uncertain relationships between input and output variables remains a central challenge in modern statistical learning and artificial intelligence. Traditional neural networks, trained via point estimation, have demonstrated remarkable success in a variety of domains but inherently provide deterministic predictions - that is, single-valued outputs without accompanying measures of uncertainty. This limitation becomes critical in domains characterized by limited, noisy, or ambiguous data, such as medicine, climate science, or finance, where quantifying uncertainty is as important as producing accurate predictions (Gal & Ghahramani, 2016; Kendall & Gal, 2017; Abdar et al., 2021). Bayesian Neural Networks (BNNs) provide a probabilistic extension of standard neural networks by treating weights and biases as random variables endowed with prior distributions (MacKay, 1992; Neal, 2012). Through Bayes' theorem, BNNs infer a posterior distribution over weights, allowing predictions to reflect epistemic uncertainty - the uncertainty arising from limited data and model knowledge. However, the exact posterior is analytically intractable for deep models, motivating approximate inference methods such as variational inference (Graves, 2011; Blundell et al., 2015) and Monte Carlo dropout (Gal & Ghahramani, 2016). Despite their appeal, these approaches may yield biased or overconfident posteriors due to restrictive variational families (Hernández-Lobato & Adams, 2015a; Osband et al., 2023), often resulting in over-smoothed predictive distributions. An alternative paradigm for probabilistic modeling is the Mixture Density Network (MDN), introduced by Bridle (1990) and developed further by Jacobs et al. (1991).
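The contrast between the two paradigms can be stated compactly (with Gaussian mixture components, the standard choice). An MDN models the conditional density directly as

$p(y \mid x) = \sum_{k=1}^{K} \pi_k(x)\,\mathcal{N}\big(y;\, \mu_k(x), \sigma_k^2(x)\big), \qquad \sum_{k=1}^{K} \pi_k(x) = 1,\ \pi_k(x) \ge 0,$

trained by minimizing the negative log-likelihood $\mathcal{L}(\theta) = -\sum_i \log p(y_i \mid x_i)$, so multimodality and aleatoric uncertainty are explicit in the output head. A BNN instead obtains its predictive distribution by marginalizing over the weight posterior,

$p(y \mid x, \mathcal{D}) = \int p(y \mid x, w)\, p(w \mid \mathcal{D})\, \mathrm{d}w,$

and it is this integral that is intractable for deep models, which is what variational inference and Monte Carlo dropout approximate.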


MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control

Zhu, Yuchen, Guo, Wei, Choi, Jaemoo, Liu, Guan-Horng, Chen, Yongxin, Tao, Molei

arXiv.org Machine Learning

We study the problem of learning a neural sampler to generate samples from discrete state spaces where the target probability mass function $\pi \propto \mathrm{e}^{-U}$ is known up to a normalizing constant, which is an important task in fields such as statistical physics, machine learning, combinatorial optimization, etc. To better address this challenging task when the state space has a large cardinality and the distribution is multi-modal, we propose $\textbf{M}$asked $\textbf{D}$iffusion $\textbf{N}$eural $\textbf{S}$ampler ($\textbf{MDNS}$), a novel framework for training discrete neural samplers by aligning two path measures through a family of learning objectives, theoretically grounded in the stochastic optimal control of continuous-time Markov chains. We validate the efficiency and scalability of MDNS through extensive experiments on various distributions with distinct statistical properties, where MDNS learns to accurately sample from the target distributions despite the extremely high problem dimensions and outperforms other learning-based baselines by a large margin. A comprehensive study of ablations and extensions is also provided to demonstrate the efficacy and potential of the proposed framework.
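For context, the classical way to sample from $\pi \propto \mathrm{e}^{-U}$ on a discrete space when only $U$ is known is MCMC, e.g. a Metropolis chain; learned neural samplers like MDNS aim to replace such chains, which mix slowly on high-dimensional multi-modal targets. A minimal Metropolis sketch on a small state space (not the paper's method; the function name and uniform proposal are illustrative assumptions):

```python
import math
import random

def metropolis_discrete(U, n_states, n_steps, seed=0):
    """Metropolis sampler for pi(x) proportional to exp(-U(x)) on {0,...,n_states-1}.
    Proposes a uniformly random state x' and accepts it with probability
    min(1, exp(U(x) - U(x'))), which equals pi(x')/pi(x)."""
    rng = random.Random(seed)
    x = rng.randrange(n_states)
    samples = []
    for _ in range(n_steps):
        xp = rng.randrange(n_states)  # symmetric uniform proposal
        if rng.random() < math.exp(min(0.0, U(x) - U(xp))):
            x = xp
        samples.append(x)
    return samples
```

On a target that puts almost all mass on one state (e.g. $U(0)=0$, $U(x)=10$ elsewhere), the chain concentrates there quickly; with many far-apart modes, however, a local chain rarely crosses between them, which is the failure mode the learned sampler is designed to avoid.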


A Comprehensive Framework for Uncertainty Quantification of Voxel-wise Supervised Models in IVIM MRI

Casali, Nicola, Brusaferri, Alessandro, Baselli, Giuseppe, Fumagalli, Stefano, Micotti, Edoardo, Forloni, Gianluigi, Hussein, Riaz, Rizzo, Giovanna, Mastropietro, Alfonso

arXiv.org Artificial Intelligence

Accurate estimation of intravoxel incoherent motion (IVIM) parameters from diffusion-weighted MRI remains challenging due to the ill-posed nature of the inverse problem and high sensitivity to noise, particularly in the perfusion compartment. In this work, we propose a probabilistic deep learning framework based on Deep Ensembles (DE) of Mixture Density Networks (MDNs), enabling estimation of total predictive uncertainty and its decomposition into aleatoric (AU) and epistemic (EU) components. The method was benchmarked against non-probabilistic neural networks, a Bayesian fitting approach, and a probabilistic network with single Gaussian parametrization. Supervised training was performed on synthetic data, and evaluation was conducted on both simulated and in vivo datasets. The reliability of the quantified uncertainties was assessed using calibration curves, output distribution sharpness, and the Continuous Ranked Probability Score (CRPS). MDNs produced more calibrated and sharper predictive distributions for the diffusion coefficient D and fraction f parameters, although slight overconfidence was observed in the pseudo-diffusion coefficient D*. The Robust Coefficient of Variation (RCV) indicated smoother in vivo estimates for D* with MDNs compared to the Gaussian model. Despite the training data covering the expected physiological range, elevated EU in vivo suggests a mismatch with real acquisition conditions, highlighting the importance of incorporating EU, which was enabled by DE. Overall, we present a comprehensive framework for IVIM fitting with uncertainty quantification, which enables the identification and interpretation of unreliable estimates. The proposed approach can also be adopted for fitting other physical models through appropriate architectural and simulation adjustments.
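The AU/EU decomposition for a deep ensemble of probabilistic predictors follows the law of total variance: average the members' predictive variances (aleatoric) and take the variance of the members' predictive means (epistemic). A minimal sketch (the function name is illustrative; for MDN members the per-member variance would be the full mixture variance, not a single Gaussian's):

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Law-of-total-variance decomposition for an ensemble.
    means, variances: arrays of shape (n_members, ...) holding each member's
    predictive mean and predictive variance per voxel/target.
    Returns (total, aleatoric, epistemic)."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    aleatoric = variances.mean(axis=0)   # average member variance
    epistemic = means.var(axis=0)        # spread of member means
    return aleatoric + epistemic, aleatoric, epistemic
```

When the members agree (identical means), epistemic uncertainty is zero and the total equals the aleatoric term; disagreement between members, as reported in vivo here, shows up entirely in the epistemic term.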


Interaction Techniques that Encourage Longer Prompts Can Improve Psychological Ownership when Writing with AI

Joshi, Nikhita, Vogel, Daniel

arXiv.org Artificial Intelligence

Writing longer prompts for an AI assistant to generate a short story increases psychological ownership, a user's feeling that the writing belongs to them. To encourage users to write longer prompts, we evaluated two interaction techniques that modify the prompt entry interface of chat-based generative AI assistants: pressing and holding the prompt submission button, and continuously moving a slider up and down when submitting a short prompt. A within-subjects experiment investigated the effects of such techniques on prompt length and psychological ownership, and results showed that these techniques increased prompt length and led to higher psychological ownership than baseline techniques. A second experiment further augmented these techniques by showing AI-generated suggestions for how the prompts could be expanded. This further increased prompt length, but did not lead to improvements in psychological ownership. Our results show that simple interface modifications like these can elicit more writing from users and improve psychological ownership.


Variational Autoencoder Framework for Hyperspectral Retrievals (Hyper-VAE) of Phytoplankton Absorption and Chlorophyll a in Coastal Waters for NASA's EMIT and PACE Missions

Lou, Jiadong, Liu, Bingqing, Xiong, Yuanheng, Zhang, Xiaodong, Yuan, Xu

arXiv.org Artificial Intelligence

Phytoplankton is an extremely diverse set of microorganisms, varying in cell morphologies, biogeochemical functions, and physiological responses to environmental disturbances [21]. As the primary producer of the ocean's food web, phytoplankton produces approximately 50 percent of Earth's oxygen, regulates the global carbon cycle and climate, and supports various ecosystem services, such as fisheries, water quality, and biodiversity [7]. Therefore, knowledge of phytoplankton biomass and their community composition is critical to understanding the food web structure, higher trophic level production (e.g., fisheries), and biological shifts among other complex Earth Science questions, especially in the context of degraded water quality (e.g., eutrophication) and climate change (e.g., warming temperatures), demanding attention at local, regional, and global scales [9]. As such, there is an increasing interdisciplinary interest in studying phytoplankton community dynamics in estuarine-coastal waters, where massive riverine inputs of nutrient-rich freshwaters often lead to eutrophication, harmful algal blooms (HABs), and the annual recurrence of bottom-water hypoxia events, which cause widespread and severe impacts on the aquatic ecosystem [3], [43], [57], [23], [54]. In the field of ocean color remote sensing, the concentration of chlorophyll a (Chl-a) and phytoplankton absorption properties (aphy) are two of the most commonly used metrics for assessing phytoplankton abundance and diversity in aquatic environments [41], [56]. These phytoplankton-related satellite algorithms are rooted in the physical principle that remote sensing reflectance (Rrs, sr^-1), the ratio of water-leaving radiance to the total downwelling irradiance just above water, is determined by the inherent optical properties (IOPs), most importantly the total backscattering coefficient (b_total, m^-1) and the total absorption coefficient (a_total, m^-1) [38].


Expectations, Explanations, and Embodiment: Attempts at Robot Failure Recovery

Yadollahi, Elmira, Dogan, Fethiye Irmak, Zhang, Yujing, Nogueira, Beatriz, Guerreiro, Tiago, Tzedek, Shelly Levy, Leite, Iolanda

arXiv.org Artificial Intelligence

Expectations critically shape how people form judgments about robots, influencing whether they view failures as minor technical glitches or deal-breaking flaws. This work explores how high and low expectations, induced through brief video priming, affect user perceptions of robot failures and the utility of explanations in HRI. We conducted two online studies (N = 600 total participants), each replicated with two robots of different embodiments, Furhat and Pepper. In our first study, grounded in expectation theory, participants were divided into two groups, one primed with positive and the other with negative expectations regarding the robot's performance, establishing distinct expectation frameworks. This validation study aimed to verify whether the videos could reliably establish low- and high-expectation profiles. In the second study, participants were primed using the validated videos and then viewed a new scenario in which the robot failed at a task. Half viewed a version where the robot explained its failure, while the other half received no explanation. We found that explanations significantly improved user perceptions of Furhat, especially when participants were primed to have lower expectations. Explanations boosted satisfaction and enhanced the robot's perceived expressiveness, indicating that effectively communicating a failure can improve how it is received. By contrast, Pepper's explanations produced minimal impact on user attitudes, suggesting that a robot's embodiment and style of interaction could determine whether explanations can successfully offset negative impressions. Together, these findings underscore the need to consider users' expectations when tailoring explanation strategies in HRI. When expectations are initially low, a cogent explanation can make the difference between dismissing a failure and appreciating the robot's transparency and effort to communicate. (Authors contributed equally.) Keywords: Expectations, Explanations, Explainability, Human-Robot Interaction, Priming

1. Introduction

When robots operate in human environments, user expectations play a crucial role in shaping human-robot interaction (HRI) (Lohse, 2009; Horstmann and Krämer, 2020; Dogan et al., 2025). However, there is often a mismatch between these expectations and the actual capabilities of social robots (Rosén et al., 2022), which can lead to disappointment and, consequently, diminish the quality of interactions (Olson et al., 1996; Kruglanski and Sleeth-Keppler, 2007). For instance, a user might expect robots to function as proactive and autonomous assistants, yet when robots make mistakes due to their limited abilities, this mismatch can undermine the robot's perceived trustworthiness and competence (Salem et al., 2015; Cha et al., 2015).


Hyperparameter Optimisation with Practical Interpretability and Explanation Methods in Probabilistic Curriculum Learning

Salt, Llewyn, Gallagher, Marcus

arXiv.org Artificial Intelligence

Hyperparameter optimisation (HPO) is crucial for achieving strong performance in reinforcement learning (RL), as RL algorithms are inherently sensitive to hyperparameter settings. Probabilistic Curriculum Learning (PCL) is a curriculum learning strategy designed to improve RL performance by structuring the agent's learning process, yet effective hyperparameter tuning remains challenging and computationally demanding. In this paper, we provide an empirical analysis of hyperparameter interactions and their effects on the performance of a PCL algorithm within standard RL tasks, including point-maze navigation and DC motor control. Using the AlgOS framework integrated with Optuna's Tree-Structured Parzen Estimator (TPE), we present strategies to refine hyperparameter search spaces, enhancing optimisation efficiency. Additionally, we introduce a novel SHAP-based interpretability approach tailored specifically for analysing hyperparameter impacts, offering clear insights into how individual hyperparameters and their interactions influence RL performance. Our work contributes practical guidelines and interpretability tools that significantly improve the effectiveness and computational feasibility of hyperparameter optimisation in reinforcement learning.