AITopics

2410.01772

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
(8 more...)

Genre:

Research Report (1.00)
Financial News (1.00)

Industry:

Banking & Finance > Trading (1.00)
Information Technology (0.93)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence Oct-2-2024

Bayesian Binary Search

Singh, Vikash, Khanzadeh, Matthew, Davis, Vincent, Rush, Harrison, Rossi, Emanuele, Shrader, Jesse, Lio, Pietro

BBS leverages machine learning/statistical techniques to estimate the probability density of the search space and modifies the bisection step to split based on probability density rather than the traditional midpoint, allowing for the learned distribution of the search space to guide the search algorithm. Search space density estimation can flexibly be performed using supervised probabilistic machine learning techniques (e.g., Gaussian process regression, Bayesian neural networks, quantile regression) or unsupervised learning algorithms (e.g., Gaussian mixture models, kernel density estimation (KDE), maximum likelihood estimation (MLE)). We demonstrate significant efficiency gains of using BBS on both simulated data across a variety of distributions and in a real-world binary search use case of probing channel balances in the Bitcoin Lightning Network, for which we have deployed the BBS algorithm in a production setting. The concept of organizing data for efficient searching has ancient roots. One of the earliest known examples is the Inakibit-Anu tablet from Babylon (c. Similar sorting techniques were evident in name lists discovered on the Aegean Islands.
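The modified bisection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the discrete index grid, and the Gaussian-shaped weights are assumptions made for illustration, and the comparison `target_idx <= split` stands in for one oracle probe. The only change from classic binary search is that the split point is the median of the remaining probability mass rather than the midpoint of the interval.

```python
import numpy as np

def bisection_probes(target_idx, n):
    """Classic binary search over indices 0..n-1; returns (index, probes)."""
    lo, hi, probes = 0, n - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if target_idx <= mid:      # one oracle probe
            hi = mid
        else:
            lo = mid + 1
    return lo, probes

def density_guided_probes(target_idx, weights):
    """Binary search that splits at the median of the remaining probability mass."""
    cdf = np.cumsum(weights, dtype=float)
    lo, hi, probes = 0, len(weights) - 1, 0
    while lo < hi:
        base = cdf[lo - 1] if lo > 0 else 0.0
        half = base + 0.5 * (cdf[hi] - base)
        # First index in [lo, hi] whose cumulative mass reaches the halfway point.
        split = int(np.searchsorted(cdf[lo:hi + 1], half)) + lo
        # Clamp so that both branches shrink the interval.
        split = min(max(split, lo), hi - 1)
        probes += 1
        if target_idx <= split:    # one oracle probe
            hi = split
        else:
            lo = split + 1
    return lo, probes

# Hypothetical setup: targets concentrate around index 700 on a grid of 1024 cells.
rng = np.random.default_rng(0)
n = 1024
idx = np.arange(n)
weights = np.exp(-0.5 * ((idx - 700) / 10.0) ** 2)
weights /= weights.sum()
targets = rng.choice(n, size=200, p=weights)

plain = [bisection_probes(t, n)[1] for t in targets]
guided = [density_guided_probes(t, weights)[1] for t in targets]
print("mean probes, midpoint split:", np.mean(plain))
print("mean probes, density split: ", np.mean(guided))
```

In this sketch the density is given directly. In the setting the abstract describes, it would instead be estimated from data, for example with KDE or a Gaussian mixture model, and the gain depends on how well that estimate matches the true distribution of targets.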

algorithm, binary search, search space, (16 more...)

2410.01771

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Giroux, James, Fanelli, Cristiano

Uncertainty Quantification with Bayesian Higher Order ReLU KANs

arXiv.org Artificial Intelligence Oct-2-2024

We introduce the first method of uncertainty quantification in the domain of Kolmogorov-Arnold Networks, specifically focusing on (Higher Order) ReLUKANs to enhance computational efficiency given the computational demands of Bayesian methods. The method we propose is general in nature, providing access to both epistemic and aleatoric uncertainties. It can also be generalized to various other basis functions. We validate our method through a series of closure tests, including simple one-dimensional functions and application to the domain of (Stochastic) Partial Differential Equations. Referring to the latter, we demonstrate the method's ability to correctly identify functional dependencies introduced through the inclusion of a stochastic term. The code supporting this work can be found at https://github.com/wmdataphys/Bayesian-HR-KAN
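For readers unfamiliar with the two uncertainty types named above, a common way to separate them is the law of total variance applied to samples from a Bayesian model. The sketch below shows that generic decomposition only. It is not taken from the linked repository, and the function name and array layout are assumptions: each row holds the predicted mean and predicted noise variance from one posterior weight sample.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split predictive variance into aleatoric and epistemic parts.

    means, variances: arrays of shape (n_posterior_samples, n_points),
    one row per sampled set of network weights (hypothetical layout).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    aleatoric = variances.mean(axis=0)  # average predicted data noise
    epistemic = means.var(axis=0)       # spread of the sampled models' means
    return aleatoric, epistemic, aleatoric + epistemic

# Two posterior samples evaluated at two input points.
aleatoric, epistemic, total = decompose_uncertainty(
    means=[[0.0, 1.0], [2.0, 1.0]],
    variances=[[1.0, 4.0], [3.0, 4.0]],
)
print(aleatoric, epistemic, total)
```

At the second input point the sampled models agree on the mean, so all of the predictive variance is attributed to data noise.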

epistemic uncertainty, equation, kolmogorov-arnold network, (16 more...)

2410.01687

Country:

North America > United States > Virginia > Williamsburg (0.04)
Europe > Latvia > Riga Municipality > Riga (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

arXiv.org Machine Learning Oct-2-2024

A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models

Kumar, Shivam, Yang, Yun, Lin, Lizhen

In this work, we explore the theoretical properties of conditional deep generative models under the statistical framework of distribution regression where the response variable lies in a high-dimensional ambient space but concentrates around a potentially lower-dimensional manifold. More specifically, we study the large-sample properties of a likelihood-based approach for estimating these models. Our results lead to the convergence rate of a sieve maximum likelihood estimator (MLE) for estimating the conditional distribution (and its devolved counterpart) of the response given predictors in the Hellinger (Wasserstein) metric. Our rates depend solely on the intrinsic dimension and smoothness of the true conditional distribution. These findings provide an explanation of why conditional deep generative models can circumvent the curse of dimensionality from the perspective of statistical foundations and demonstrate that they can learn a broader class of nearly singular conditional distributions. Our analysis also emphasizes the importance of introducing a small noise perturbation to the data when they are supported sufficiently close to a manifold. Finally, in our numerical studies, we demonstrate the effective implementation of the proposed approach using both synthetic and real-world datasets, which also provide complementary validation to our theoretical findings.

deep generative model, estimation, generative model, (16 more...)

2410.02025

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.81)

Müller, Samuel, Hollmann, Noah, Hutter, Frank

Bayes' Power for Explaining In-Context Learning Generalizations

arXiv.org Machine Learning Oct-2-2024

Traditionally, neural network training has been primarily viewed as an approximation of maximum likelihood estimation (MLE). This interpretation originated in a time when training for multiple epochs on small datasets was common and performance was data-bound, but it falls short in the era of large-scale single-epoch training ushered in by large self-supervised setups, like language models. In this new setup, performance is compute-bound, but data is readily available. As models became more powerful, in-context learning (ICL), i.e., learning in a single forward pass based on the context, emerged as one of the dominant paradigms. In this paper, we argue that a more useful interpretation of neural network behavior in this era is as an approximation of the true posterior, as defined by the data-generating process. We demonstrate this interpretation's power for ICL and its usefulness to predict generalizations to previously unseen tasks. We show how models become robust in-context learners by effectively composing knowledge from their training data. We illustrate this with experiments that reveal surprising generalizations, all explicable through the exact posterior. Finally, we show the inherent constraints of the generalization capabilities of posteriors and the limitations of neural networks in approximating these posteriors.

neural network, posterior, prediction, (13 more...)

2410.01565

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > Germany > Baden-Württemberg > Freiburg (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Gahlot, Abhinav Prakash, Orozco, Rafael, Yin, Ziyi, Herrmann, Felix J.

An uncertainty-aware Digital Shadow for underground multimodal CO2 storage monitoring

Geological Carbon Storage (GCS) is arguably the only scalable net-negative CO2 emission technology available. While promising, subsurface complexities and heterogeneity of reservoir properties demand a systematic approach to quantify uncertainty when optimizing production and mitigating storage risks, which include assurances of Containment and Conformance of injected supercritical CO2. As a first step towards the design and implementation of a Digital Twin for monitoring underground storage operations, a machine learning based data-assimilation framework is introduced and validated on carefully designed realistic numerical simulations. As our implementation is based on Bayesian inference but does not yet support control and decision-making, we coin our approach an uncertainty-aware Digital Shadow. To characterize the posterior distribution for the state of CO2 plumes conditioned on multi-modal time-lapse data, the envisioned Shadow combines techniques from Simulation-Based Inference (SBI) and Ensemble Bayesian Filtering to establish probabilistic baselines and assimilate multi-modal data for GCS problems that are challenged by large degrees of freedom, nonlinear multi-physics, non-Gaussianity, and computationally expensive-to-evaluate fluid flow and seismic simulations. To enable SBI for dynamic systems, a recursive scheme is proposed where the Digital Shadow's neural networks are trained on simulated ensembles for their state and observed data (well and/or seismic). Once training is completed, the system's state is inferred when time-lapse field data becomes available. In this computational study, we observe that a lack of knowledge on the permeability field can be factored into the Digital Shadow's uncertainty quantification. To our knowledge, this work represents the first proof of concept of an uncertainty-aware, in-principle scalable Digital Shadow.

artificial intelligence, co2, machine learning, (17 more...)

2410.01218

Country:

Asia > Middle East > Israel > Mediterranean Sea (0.24)
Atlantic Ocean > North Sea (0.14)
Europe > United Kingdom (0.14)
(2 more...)

Genre: Research Report (0.82)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Binary Cycle Geothermal Power Plant > Supercritical CO2 Geothermal Power Plant (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Scientific Computing (1.00)
Information Technology > Modeling & Simulation (1.00)
(3 more...)

Exploring the Learning Capabilities of Language Models using LEVERWORLDS

Wagner, Eitan, Feder, Amir, Abend, Omri

Learning a model of a stochastic setting often involves learning both general structure rules and specific properties of the instance. This paper investigates the interplay between learning the general and the specific in various learning methods, with emphasis on sample efficiency. We design a framework called LeverWorlds, which allows the generation of simple physics-inspired worlds that follow a similar generative process with different distributions, and their instances can be expressed in natural language. These worlds allow for controlled experiments to assess the sample complexity of different learning methods. We experiment with classic learning algorithms as well as Transformer language models, both with fine-tuning and In-Context Learning (ICL). Our general finding is that (1) Transformers generally succeed in the task; but (2) they are considerably less sample efficient than classic methods that make stronger assumptions about the structure, such as Maximum Likelihood Estimation and Logistic Regression. This finding is in tension with the recent tendency to use Transformers as general-purpose estimators. We propose an approach that leverages the ICL capabilities of contemporary language models to apply simple algorithms for this type of data. Our experiments show that models currently struggle with the task but show promising potential.

computational linguistic, experiment, language model, (14 more...)

2410.00519

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre:

Research Report > New Finding (0.90)
Research Report > Experimental Study (0.90)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Viyuela, Oscar Gil, Sanfeliu, Alberto

Human-Robot Collaborative Minimum Time Search through Sub-priors in Ant Colony Optimization

Human-Robot Collaboration (HRC) has evolved into a highly promising field owing to the latest breakthroughs in Artificial Intelligence (AI) and Human-Robot Interaction (HRI), among other reasons. This growth increases the need to design multi-agent algorithms that can also manage human preferences. This paper presents an extension of the Ant Colony Optimization (ACO) meta-heuristic to solve the Minimum Time Search (MTS) task, in the case where humans and robots perform an object searching task together. The proposed model consists of two main blocks. The first one is a convolutional neural network (CNN) that provides the prior probabilities about where an object may be from a segmented image. The second one is the Sub-prior MTS-ACO algorithm (SP-MTS-ACO), which takes as inputs the prior probabilities and the particular search preferences of the agents in different sub-priors to generate search plans for all agents. The model has been tested in real experiments for the joint search of an object through a Vizanti web-based visualization on a tablet computer. The designed interface allows communication between a human and our humanoid robot named IVO. The obtained results show an improvement in the search perception of the users without loss of efficiency.

agent, interface, participant, (13 more...)

doi: 10.1109/LRA.2024.3471451

2410.00517

Country:

North America > United States > New York > New York County > New York City (0.04)
South America > Uruguay > Artigas > Artigas (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

Enhancing Solution Efficiency in Reinforcement Learning: Leveraging Sub-GFlowNet and Entropy Integration

He, Siyi

Traditional reinforcement learning (RL) often struggles to generate diverse, high-reward solutions, especially in domains like drug design and black-box function optimization. Markov Chain Monte Carlo (MCMC) methods provide an alternative to RL for candidate selection but suffer from high computational costs and limited ability to explore diverse candidates. In response, GFlowNet, a novel neural network architecture, was introduced to model complex system dynamics and generate diverse high-reward trajectories. To further enhance this approach, this paper proposes improvements to GFlowNet by introducing a new loss function and refining the training objective associated with sub-GFlowNet. These enhancements aim to integrate entropy and leverage network structure characteristics, improving both candidate diversity and computational efficiency. We demonstrate the superiority of the refined GFlowNet over traditional methods through empirical results from hypergrid experiments and molecule synthesis tasks. The findings underscore the effectiveness of incorporating entropy and exploiting network structure properties in solution generation in molecule synthesis as well as diverse experimental designs.

gflownet, loss function, trajectory, (15 more...)

2410.00461

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

arXiv.org Artificial Intelligence Sep-30-2024

Demonstrating the Continual Learning Capabilities and Practical Application of Discrete-Time Active Inference

Prakki, Rithvik

Active inference is a mathematical framework for understanding how agents (biological or artificial) interact with their environments, enabling continual adaptation and decision-making. It combines Bayesian inference and free energy minimization to model perception, action, and learning in uncertain and dynamic contexts. Unlike reinforcement learning, active inference integrates exploration and exploitation seamlessly by minimizing expected free energy. In this paper, we present a continual learning framework for agents operating in discrete time environments, using active inference as the foundation. We derive the mathematical formulations of variational and expected free energy and apply them to the design of a self-learning research agent. This agent updates its beliefs and adapts its actions based on new data without manual intervention. Through experiments in changing environments, we demonstrate the agent's ability to relearn and refine its models efficiently, making it suitable for complex domains like finance and healthcare. The paper concludes by discussing how the proposed framework generalizes to other systems, positioning active inference as a flexible approach for adaptive AI.
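As a small illustration of the variational free energy mentioned above, the sketch below evaluates the standard discrete-state form F = sum_s q(s) [ln q(s) - ln p(o, s)] for a two-state toy model. The numbers and names are assumptions chosen for illustration and are not taken from the paper. It shows the well-known property that F is smallest when the belief q equals the exact posterior, where it equals the negative log evidence.

```python
import numpy as np

def variational_free_energy(q, prior, likelihood_o):
    """F = sum_s q(s) * (ln q(s) - ln p(o, s)), with p(o, s) = p(o | s) p(s)."""
    q = np.asarray(q, dtype=float)
    joint = np.asarray(likelihood_o, dtype=float) * np.asarray(prior, dtype=float)
    mask = q > 0  # terms with q(s) = 0 contribute nothing
    return float(np.sum(q[mask] * (np.log(q[mask]) - np.log(joint[mask]))))

# Hypothetical two-state model and one observed outcome o.
prior = np.array([0.5, 0.5])          # p(s)
likelihood_o = np.array([0.9, 0.2])   # p(o | s) for the observed o
evidence = float(np.sum(likelihood_o * prior))
posterior = likelihood_o * prior / evidence

f_prior = variational_free_energy(prior, prior, likelihood_o)
f_post = variational_free_energy(posterior, prior, likelihood_o)
print("F with q = prior:    ", f_prior)
print("F with q = posterior:", f_post)
print("-ln p(o):            ", -np.log(evidence))
```

Expected free energy, which the abstract also mentions, additionally scores future actions and is not covered by this sketch.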

artificial intelligence, bayesian inference, machine learning, (16 more...)

2410.0024

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.89)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)