AITopics

2501.18528

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)

Barbier, Jean, Camilli, Francesco, Nguyen, Minh-Toan, Pastore, Mauro, Skerk, Rudy

Optimal generalisation and learning transition in extensive-width shallow neural networks near interpolation

arXiv.org Machine LearningJan-30-2025

We consider a teacher-student model of supervised learning with a fully-trained 2-layer neural network whose width $k$ and input dimension $d$ are large and proportional. We compute the Bayes-optimal generalisation error of the network for any activation function in the regime where the number of training data $n$ scales quadratically with the input dimension, i.e., around the interpolation threshold where the number of trainable parameters $kd+k$ and of data points $n$ are comparable. Our analysis tackles generic weight distributions. Focusing on binary weights, we uncover a discontinuous phase transition separating a "universal" phase from a "specialisation" phase. In the first, the generalisation error is independent of the weight distribution and decays slowly with the sampling rate $n/d^2$, with the student learning only some non-linear combinations of the teacher weights. In the latter, the error is weight distribution-dependent and decays faster due to the alignment of the student towards the teacher network. We thus unveil the existence of a highly predictive solution near interpolation, which is however potentially hard to find.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2501.1853

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Basu, Sattwik, Dutta, Debottam, Wei, Yu-Lin, Choudhury, Romit Roy

Estimating Multi-chirp Parameters using Curvature-guided Langevin Monte Carlo

arXiv.org Machine LearningJan-30-2025

This paper considers the problem of estimating chirp parameters from a noisy mixture of chirps. While a rich body of work exists in this area, challenges remain when extending these techniques to chirps of higher order polynomials. We formulate this as a non-convex optimization problem and propose a modified Langevin Monte Carlo (LMC) sampler that exploits the average curvature of the objective function to reliably find the minimizer. Results show that our Curvature-guided LMC (CG-LMC) algorithm is robust and succeeds even in low SNR regimes, making it viable for practical applications.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2501.18178

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Amani, Mani, Akhavian, Reza

Bayesian BIM-Guided Construction Robot Navigation with NLP Safety Prompts in Dynamic Environments

arXiv.org Artificial IntelligenceJan-29-2025

Construction robotics increasingly relies on natural language processing for task execution, creating a need for robust methods to interpret commands in complex, dynamic environments. While existing research primarily focuses on what tasks robots should perform, less attention has been paid to how these tasks should be executed safely and efficiently. This paper presents a novel probabilistic framework that uses sentiment analysis from natural language commands to dynamically adjust robot navigation policies in construction environments. The framework leverages Building Information Modeling (BIM) data and natural language prompts to create adaptive navigation strategies that account for varying levels of environmental risk and uncertainty. We introduce an object-aware path planning approach that combines exponential potential fields with a grid-based representation of the environment, where the potential fields are dynamically adjusted based on the semantic analysis of user prompts. The framework employs Bayesian inference to consolidate multiple information sources: the static data from BIM, the semantic content of natural language commands, and the implied safety constraints from user prompts. We demonstrate our approach through experiments comparing three scenarios: baseline shortest-path planning, safety-oriented navigation, and risk-aware routing. Results show that our method successfully adapts path planning based on natural language sentiment, achieving a 50\% improvement in minimum distance to obstacles when safety is prioritized, while maintaining reasonable path lengths. Scenarios with contrasting prompts, such as "dangerous" and "safe", demonstrate the framework's ability to modify paths. This approach provides a flexible foundation for integrating human knowledge and safety considerations into construction robot navigation.

artificial intelligence, large language model, natural language, (17 more...)

2501.17437

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)

Chen, Alex, Chlenski, Philipe, Munyuza, Kenneth, Moretti, Antonio Khalil, Naesseth, Christian A., Pe'er, Itsik

Variational Combinatorial Sequential Monte Carlo for Bayesian Phylogenetics in Hyperbolic Space

arXiv.org Machine LearningJan-29-2025

Hyperbolic space naturally encodes hierarchical structures such as phylogenies (binary trees), where inward-bending geodesics reflect paths through least common ancestors, and the exponential growth of neighborhoods mirrors the super-exponential scaling of topologies. This scaling challenge limits the efficiency of Euclidean-based approximate inference methods. Motivated by the geometric connections between trees and hyperbolic space, we develop novel hyperbolic extensions of two sequential search algorithms: Combinatorial and Nested Combinatorial Sequential Monte Carlo (\textsc{Csmc} and \textsc{Ncsmc}). Our approach introduces consistent and unbiased estimators, along with variational inference methods (\textsc{H-Vcsmc} and \textsc{H-Vncsmc}), which outperform their Euclidean counterparts. Empirical results demonstrate improved speed, scalability and performance in high-dimensional phylogenetic inference tasks.

artificial intelligence, bayesian inference, combinatorial sequential monte carlo, (14 more...)

2501.17965

Country:

Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Yoo, Se-Wook, Seo, Seung-Woo

DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems

arXiv.org Artificial IntelligenceJan-29-2025

Safe reinforcement learning has traditionally relied on predefined constraint functions to ensure safety in complex real-world tasks, such as autonomous driving. However, defining these functions accurately for varied tasks is a persistent challenge. Recent research highlights the potential of leveraging pre-acquired task-agnostic knowledge to enhance both safety and sample efficiency in related tasks. Building on this insight, we propose a novel method to learn shared constraint distributions across multiple tasks. Our approach identifies the shared constraints through imitation learning and then adapts to new tasks by adjusting risk levels within these learned distributions. This adaptability addresses variations in risk sensitivity stemming from expert-specific biases, ensuring consistent adherence to general safety principles even with imperfect demonstrations. Our method can be applied to control and navigation domains, including multi-task and meta-task scenarios, accommodating constraints such as maintaining safe distances or adhering to speed limits. Experimental results validate the efficacy of our approach, demonstrating superior safety performance and success rates compared to baselines, all without requiring task-specific constraint definitions. These findings underscore the versatility and practicality of our method across a wide range of real-world tasks.

constraint, machine learning, reinforcement learning, (16 more...)

2501.18086

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.48)

Industry: Transportation > Ground > Road (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Swaroop, Siddharth, Khan, Mohammad Emtiyaz, Doshi-Velez, Finale

Connecting Federated ADMM to Bayes

arXiv.org Machine LearningJan-28-2025

We provide new connections between two distinct federated learning approaches based on (i) ADMM and (ii) Variational Bayes (VB), and propose new variants by combining their complementary strengths. Specifically, we show that the dual variables in ADMM naturally emerge through the "site" parameters used in VB with isotropic Gaussian covariances. Using this, we derive two versions of ADMM from VB that use flexible covariances and functional regularisation, respectively. Through numerical experiments, we validate the improvements obtained in performance. The work shows connection between two fields that are believed to be fundamentally different and combines them to improve federated learning. The goal of federated learning is to train a global model in the central server by using the data distributed over many local clients (McMahan et al., 2016). Such distributed learning improves privacy, security, and robustness, but is challenging due to frequent communication needed to synchronise training among nodes. This is especially true when the data quality differs drastically from client to client and needs to be appropriately weighted. Designing new methods to deal with such challenges is an active area of research in federated learning. We focus on two distinct federated-learning approaches based on the Alternating Direction Method of Multipliers (ADMM) and Variational Bayes (VB), respectively. The ADMM approach synchronises the global and local models by using constrained optimisation and updates both primal and dual variables simultaneously.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2501.17325

Country:

Europe > Austria > Vienna (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Virginia (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Reuter, Arik, Rudner, Tim G. J., Fortuin, Vincent, Rügamer, David

Can Transformers Learn Full Bayesian Inference in Context?

arXiv.org Artificial IntelligenceJan-28-2025

Transformers have emerged as the dominant architecture in the field of deep learning, with a broad range of applications and remarkable in-context learning (ICL) capabilities. While not yet fully understood, ICL has already proved to be an intriguing phenomenon, allowing transformers to learn in context -- without requiring further training. In this paper, we further advance the understanding of ICL by demonstrating that transformers can perform full Bayesian inference for commonly used statistical models in context. More specifically, we introduce a general framework that builds on ideas from prior fitted networks and continuous normalizing flows which enables us to infer complex posterior distributions for methods such as generalized linear models and latent factor models. Extensive experiments on real-world datasets demonstrate that our ICL approach yields posterior samples that are similar in quality to state-of-the-art MCMC or variational inference methods not operating in context.

artificial intelligence, bayesian inference, machine learning, (13 more...)

2501.16825

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > New York (0.04)
North America > United States > Michigan (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Mohr, Felix, van Rijn, Jan N.

Learning Curves for Decision Making in Supervised Machine Learning: A Survey

arXiv.org Artificial IntelligenceJan-28-2025

Learning curves are a concept from social sciences that has been adopted in the context of machine learning to assess the performance of a learning algorithm with respect to a certain resource, e.g., the number of training examples or the number of training iterations. Learning curves have important applications in several machine learning contexts, most notably in data acquisition, early stopping of model training, and model selection. For instance, learning curves can be used to model the performance of the combination of an algorithm and its hyperparameter configuration, providing insights into their potential suitability at an early stage and often expediting the algorithm selection process. Various learning curve models have been proposed to use learning curves for decision making. Some of these models answer the binary decision question of whether a given algorithm at a certain budget will outperform a certain reference performance, whereas more complex models predict the entire learning curve of an algorithm. We contribute a framework that categorises learning curve approaches using three criteria: the decision-making situation they address, the intrinsic learning curve question they answer and the type of resources they use. We survey papers from the literature and classify them into this framework.

artificial intelligence, learner, machine learning, (18 more...)

doi: 10.1007/s10994-024-06619-7

2201.1215

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
Asia > Middle East > Jordan (0.04)
South America > Colombia > Cundinamarca Department (0.04)
North America > United States > Wisconsin (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Education (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.45)
Health & Medicine > Therapeutic Area (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(3 more...)

Neural Information Processing SystemsJan-27-2025, 13:08:06 GMT

Review for NeurIPS paper: Adaptive Experimental Design with Temporal Interference: A Maximum Likelihood Approach

Weaknesses: - Can we interpret the results as follows: If the TAR assumption is satisfied with positive limits, and we use MLE, then temporal interference does not cause bias. If this interpretation is correct, then it would be illuminating if the authors provide the intuitive connection between the TAR assumption and temporal interference. It is not clear if the estimations that the authors have required are feasible if the state space is large. The next natural question is how robust the results are if we use other methods for estimation. This could have been shown by providing some simulations, which is a part missing from the manuscript.

adaptive experimental design, maximum likelihood approach, temporal interference, (3 more...)

Neural Information Processing Systems

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)