AITopics

2409.02668

Genre:

Workflow (1.00)
Summary/Review (1.00)
Instructional Material (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
(6 more...)

arXiv.org Artificial IntelligenceSep-4-2024

Non-stationary and Sparsely-correlated Multi-output Gaussian Process with Spike-and-Slab Prior

Xinming, Wang, Yongxiang, Li, Xiaowei, Yue, Jianguo, Wu

Multi-output Gaussian process (MGP) is commonly used as a transfer learning method to leverage information among multiple outputs. A key advantage of MGP is providing uncertainty quantification for prediction, which is highly important for subsequent decision-making tasks. However, traditional MGP may not be sufficiently flexible to handle multivariate data with dynamic characteristics, particularly when dealing with complex temporal correlations. Additionally, since some outputs may lack correlation, transferring information among them may lead to negative transfer. To address these issues, this study proposes a non-stationary MGP model that can capture both the dynamic and sparse correlation among outputs. Specifically, the covariance functions of MGP are constructed using convolutions of time-varying kernel functions. Then a dynamic spike-and-slab prior is placed on correlation parameters to automatically decide which sources are informative to the target output in the training process. An expectation-maximization (EM) algorithm is proposed for efficient model fitting. Both numerical studies and a real case demonstrate its efficacy in capturing dynamic and sparse correlation structure and mitigating negative transfer for high-dimensional time-series data. Finally, a mountain-car reinforcement learning case highlights its potential application in decision making problems.

artificial intelligence, inform journal, machine learning, (17 more...)

2409.03149

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
(2 more...)

Straub, Dominik, Niehues, Tobias F., Peters, Jan, Rothkopf, Constantin A.

Inverse decision-making using neural amortized Bayesian actors

arXiv.org Artificial IntelligenceSep-4-2024

Bayesian observer and actor models have provided normative explanations for many behavioral phenomena in perception, sensorimotor control, and other areas of cognitive science and neuroscience. They attribute behavioral variability and biases to different interpretable entities such as perceptual and motor uncertainty, prior beliefs, and behavioral costs. However, when extending these models to more complex tasks with continuous actions, solving the Bayesian decision-making problem is often analytically intractable. Moreover, inverting such models to perform inference over their parameters given behavioral data is computationally even more difficult. Therefore, researchers typically constrain their models to easily tractable components, such as Gaussian distributions or quadratic cost functions, or resort to numerical methods. To overcome these limitations, we amortize the Bayesian actor using a neural network trained on a wide range of different parameter settings in an unsupervised fashion. Using the pre-trained neural network enables performing gradient-based Bayesian inference of the Bayesian actor model's parameters. We show on synthetic data that the inferred posterior distributions are in close alignment with those obtained using analytical solutions where they exist. Where no analytical solution is available, we recover posterior distributions close to the ground truth. We then show that identifiability problems between priors and costs can arise in more complex cost functions. Finally, we apply our method to empirical data and show that it explains systematic individual differences of behavioral patterns.

cost function, neural network, optimal action, (12 more...)

2409.0371

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Machine LearningSep-3-2024

A sparse PAC-Bayesian approach for high-dimensional quantile prediction

Mai, The Tien

Quantile regression, a robust method for estimating conditional quantiles, has advanced significantly in fields such as econometrics, statistics, and machine learning. In high-dimensional settings, where the number of covariates exceeds sample size, penalized methods like lasso have been developed to address sparsity challenges. Bayesian methods, initially connected to quantile regression via the asymmetric Laplace likelihood, have also evolved, though issues with posterior variance have led to new approaches, including pseudo/score likelihoods. This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction. It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation. The method demonstrates strong theoretical guarantees, through PAC-Bayes bounds, that establish non-asymptotic oracle inequalities, showing minimax-optimal prediction error and adaptability to unknown sparsity. Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.

mse 0, probability, quantile regression, (15 more...)

2409.01687

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Vélez-Cruz, Nayely, Laubichler, Manfred D.

Bayesian Learning in a Nonlinear Multiscale State-Space Model

arXiv.org Machine LearningSep-3-2024

In many biological systems, the developmental processes of individuals play a crucial role in shaping the traits, characteristics, and growth patterns of subsequent generations. Throughout various stages of growth and maturation, organisms undergo significant changes that impact their overall fitness and reproductive success. These developmental stages, ranging from early cellular differentiation to reproductive maturity, each contribute uniquely to the organism's ability to survive and transmit biological information to offspring. Conversely, hereditary processes also influence the developmental stages of subsequent generations, creating a feedback loop where the heritable traits and adaptations of individuals as well as their health statuses such as disease resistance, metabolic efficiency, or physiological robustness can impact the developmental trajectories of future generations. This feedback loop between developmental processes and heredity continually shapes evolutionary trajectories, driving adaptation and resilience in populations over time.

eptember 4, fine time scale trajectory, trajectory, (14 more...)

2408.06425

Country:

North America > United States > Arizona > Maricopa County > Tempe (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Consumer Health (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Collis, Poppy, Singh, Ryan, Kinghorn, Paul F, Buckley, Christopher L

Learning in Hybrid Active Inference Models

arXiv.org Artificial IntelligenceSep-2-2024

An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work in computational neuroscience has considered this functional integration of discrete and continuous variables during decision-making under the formalism of active inference (Parr, Friston & de Vries, 2017; Parr & Friston, 2018). However, their focus is on the expressive physical implementation of categorical decisions and the hierarchical mixed generative model is assumed to be known. As a consequence, it is unclear how this framework might be extended to learning. We therefore present a novel hierarchical hybrid active inference agent in which a high-level discrete active inference planner sits above a low-level continuous active inference controller. We make use of recent work in recurrent switching linear dynamical systems (rSLDS) which implement end-to-end learning of meaningful discrete representations via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). The representations learned by the rSLDS inform the structure of the hybrid decision-making agent and allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space allowing us to exploit information-theoretic exploration bonuses and (3) `cache' the approximate solutions to low-level problems in the discrete planner. We apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and successful planning through the delineation of abstract sub-goals.

approximation, controller, learning, (12 more...)

2409.01066

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > East Sussex > Brighton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.54)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
(2 more...)

Mootoovaloo, Arrykrishna, García-García, Carlos, Alonso, David, Ruiz-Zapatero, Jaime

$\mathtt{emuflow}$: Normalising Flows for Joint Cosmological Analysis

arXiv.org Artificial IntelligenceSep-2-2024

Given the growth in the variety and precision of astronomical datasets of interest for cosmology, the best cosmological constraints are invariably obtained by combining data from different experiments. At the likelihood level, one complication in doing so is the need to marginalise over large-dimensional parameter models describing the data of each experiment. These include both the relatively small number of cosmological parameters of interest and a large number of "nuisance" parameters. Sampling over the joint parameter space for multiple experiments can thus become a very computationally expensive operation. This can be significantly simplified if one could sample directly from the marginal cosmological posterior distribution of preceding experiments, depending only on the common set of cosmological parameters. In this paper, we show that this can be achieved by emulating marginal posterior distributions via normalising flows. The resulting trained normalising flow models can be used to efficiently combine cosmological constraints from independent datasets without increasing the dimensionality of the parameter space under study. We show that the method is able to accurately describe the posterior distribution of real cosmological datasets, as well as the joint distribution of different datasets, even when significant tension exists between experiments. The resulting joint constraints can be obtained in a fraction of the time it would take to combine the same datasets at the level of their likelihoods. We construct normalising flow models for a set of public cosmological datasets of general interests and make them available, together with the software used to train them, and to exploit them in cosmological parameter inference.

experiment, flow model, posterior, (16 more...)

2409.01407

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine LearningSep-2-2024

Stein transport for Bayesian inference

Nüsken, Nikolas

We introduce $\textit{Stein transport}$, a novel methodology for Bayesian inference designed to efficiently push an ensemble of particles along a predefined curve of tempered probability distributions. The driving vector field is chosen from a reproducing kernel Hilbert space and can be derived either through a suitable kernel ridge regression formulation or as an infinitesimal optimal transport map in the Stein geometry. The update equations of Stein transport resemble those of Stein variational gradient descent (SVGD), but introduce a time-varying score function as well as specific weights attached to the particles. While SVGD relies on convergence in the long-time limit, Stein transport reaches its posterior approximation at finite time $t=1$. Studying the mean-field limit, we discuss the errors incurred by regularisation and finite-particle effects, and we connect Stein transport to birth-death dynamics and Fisher-Rao gradient flows. In a series of experiments, we show that in comparison to SVGD, Stein transport not only often reaches more accurate posterior approximations with a significantly reduced computational budget, but that it also effectively mitigates the variance collapse phenomenon commonly observed in SVGD.

bayesian inference, stein transport

2409.01464

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.60)
Information Technology > Artificial Intelligence > Machine Learning (0.53)

arXiv.org Artificial IntelligenceSep-2-2024

Dataset Distillation from First Principles: Integrating Core Information Extraction and Purposeful Learning

Kungurtsev, Vyacheslav, Peng, Yuanfang, Gu, Jianyang, Vahidian, Saeed, Quinn, Anthony, Idlahcen, Fadwa, Chen, Yiran

Dataset distillation (DD) is an increasingly important technique that focuses on constructing a synthetic dataset capable of capturing the core information in training data to achieve comparable performance in models trained on the latter. While DD has a wide range of applications, the theory supporting it is less well evolved. New methods of DD are compared on a common set of benchmarks, rather than oriented towards any particular learning task. In this work, we present a formal model of DD, arguing that a precise characterization of the underlying optimization problem must specify the inference task associated with the application of interest. Without this task-specific focus, the DD problem is under-specified, and the selection of a DD algorithm for a particular task is merely heuristic. Our formalization reveals novel applications of DD across different modeling environments. We analyze existing DD methods through this broader lens, highlighting their strengths and limitations in terms of accuracy and faithfulness to optimal DD operation. Finally, we present numerical results for two case studies important in contemporary settings. Firstly, we address a critical challenge in medical data analysis: merging the knowledge from different datasets composed of intersecting, but not identical, sets of features, in order to construct a larger dataset in what is usually a small sample setting. Secondly, we consider out-of-distribution error across boundary conditions for physics-informed neural networks (PINNs), showing the potential for DD to provide more physically faithful data. By establishing this general formulation of DD, we aim to establish a new research paradigm by which DD can be understood and from which new DD techniques can arise.

dataset, distillation, optimization problem, (14 more...)

2409.0141

Country:

Europe > Czechia > Prague (0.04)
North America > United States > Ohio (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre:

Research Report (0.82)
Overview (0.66)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

arXiv.org Artificial IntelligenceSep-1-2024

DAMe: Personalized Federated Social Event Detection with Dual Aggregation Mechanism

Yu, Xiaoyan, Wei, Yifan, Li, Pu, Zhou, Shuaishuai, Peng, Hao, Sun, Li, Zhu, Liehuang, Yu, Philip S.

Training social event detection models through federated learning (FedSED) aims to improve participants' performance on the task. However, existing federated learning paradigms are inadequate for achieving FedSED's objective and exhibit limitations in handling the inherent heterogeneity in social data. This paper proposes a personalized federated learning framework with a dual aggregation mechanism for social event detection, namely DAMe. We present a novel local aggregation strategy utilizing Bayesian optimization to incorporate global knowledge while retaining local characteristics. Moreover, we introduce a global aggregation strategy to provide clients with maximum external knowledge of their preferences. In addition, we incorporate a global-local event-centric constraint to prevent local overfitting and ``client-drift''. Experiments within a realistic simulation of a natural federated setting, utilizing six social event datasets spanning six languages and two social media platforms, along with an ablation study, have demonstrated the effectiveness of the proposed framework. Further robustness analyses have shown that DAMe is resistant to injection attacks.

dame, global model, social event detection, (12 more...)

2409.00614

Country:

North America > United States > Idaho > Ada County > Boise (0.05)
Asia > China > Beijing > Beijing (0.05)
Asia > China > Yunnan Province > Kunming (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Social Events (1.00)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)