AITopics

2304.06729

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Peng, Yifan, Chen, Yian, Stoudenmire, E. Miles, Khoo, Yuehaw

Generative Modeling via Hierarchical Tensor Sketching

arXiv.org Artificial IntelligenceApr-11-2023

We propose a hierarchical tensor-network approach for approximating high-dimensional probability density via empirical distribution. This leverages randomized singular value decomposition (SVD) techniques and involves solving linear equations for tensor cores in this tensor network. The complexity of the resulting algorithm scales linearly in the dimension of the high-dimensional density. An analysis of estimation error demonstrates the effectiveness of this method through several numerical experiments.

artificial intelligence, ising model, machine learning, (17 more...)

2304.05305

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Goldblum, Micah, Finzi, Marc, Rowan, Keefer, Wilson, Andrew Gordon

The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning

arXiv.org Artificial IntelligenceApr-11-2023

No free lunch theorems for supervised learning state that no learner can solve all problems or that all learners achieve exactly the same accuracy on average over a uniform distribution on learning problems. Accordingly, these theorems are often referenced in support of the notion that individual problems require specially tailored inductive biases. While virtually all uniformly sampled datasets have high complexity, real-world problems disproportionately generate low-complexity data, and we argue that neural network models share this same preference, formalized using Kolmogorov complexity. Notably, we show that architectures designed for a particular domain, such as computer vision, can compress datasets on a variety of seemingly unrelated domains. Our experiments show that pre-trained and even randomly initialized language models prefer to generate low-complexity sequences. Whereas no free lunch theorems seemingly indicate that individual problems require specialized learners, we explain how tasks that often require human intervention such as picking an appropriately sized model when labeled data is scarce or plentiful can be automated into a single learning algorithm. These observations justify the trend in deep learning of unifying seemingly disparate problems with an increasingly small set of machine learning models.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2304.05366

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
(2 more...)

arXiv.org Artificial IntelligenceApr-11-2023

Habits and goals in synergy: a variational Bayesian framework for behavior

Han, Dongqi, Doya, Kenji, Li, Dongsheng, Tani, Jun

How to behave efficiently and flexibly is a central problem for understanding biological agents and creating intelligent embodied AI. It has been well known that behavior can be classified as two types: reward-maximizing habitual behavior, which is fast while inflexible; and goal-directed behavior, which is flexible while slow. Conventionally, habitual and goal-directed behaviors are considered handled by two distinct systems in the brain. Here, we propose to bridge the gap between the two behaviors, drawing on the principles of variational Bayesian theory. We incorporate both behaviors in one framework by introducing a Bayesian latent variable called "intention". The habitual behavior is generated by using prior distribution of intention, which is goal-less; and the goal-directed behavior is generated by the posterior distribution of intention, which is conditioned on the goal. Building on this idea, we present a novel Bayesian framework for modeling behaviors. Our proposed framework enables skill sharing between the two kinds of behaviors, and by leveraging the idea of predictive coding, it enables an agent to seamlessly generalize from habitual to goal-directed behavior without requiring additional training. The proposed framework suggests a fresh perspective for cognitive science and embodied AI, highlighting the potential for greater integration between habitual and goal-directed behaviors.

habitual behavior, machine learning, natural language, (20 more...)

2304.05008

Country:

Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Asia > Macao (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Leisure & Entertainment (0.92)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Neural Information Processing SystemsApr-10-2023, 13:47:19 GMT

Bayesian Dark Knowledge

We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/ or where we need accurate posterior predictive densities p(y|x, D), e.g., for applications involving bandits or active learning. One simple approach to this is to use online Monte Carlo methods, such as SGLD (stochastic gradient Langevin dynamics). Unfortunately, such a method needs to store many copies of the parameters (which wastes memory), and needs to make predictions using many versions of the model (which wastes time). We describe a method for "distilling" a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network. We compare to two very recent approaches to Bayesian neural networks, namely an approach based on expectation propagation [HLA15] and an approach based on variational Bayes [BCKW15]. Our method performs better than both of these, is much simpler to implement, and uses less computation at test time.

approximation, neural network, sgld, (17 more...)

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Industry: Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Neural Information Processing SystemsApr-10-2023, 12:41:35 GMT

Linear Response Methods for Accurate Covariance Estimates from Mean Field Variational Bayes

Mean field variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets. However, a well known major failing of MFVB is that it underestimates the uncertainty of model variables (sometimes severely) and provides no information about model variable covariance. We generalize linear response methods from statistical physics to deliver accurate uncertainty estimates for model variables--both for individual variables and coherently across variables. We call our method linear response variational Bayes (LRVB). When the MFVB posterior approximation is in the exponential family, LRVB has a simple, analytic form, even for non-conjugate models. Indeed, we make no assumptions about the form of the true posterior. We demonstrate the accuracy and scalability of our method on a range of models for both simulated and real data.

covariance estimate, mfvb, posterior, (13 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Neural Information Processing SystemsApr-10-2023, 12:23:00 GMT

Automatic Variational Inference in Stan Alp Kucukelbir

Variational inference is a scalable technique for approximate Bayesian inference. Deriving variational inference algorithms requires tedious model-specific calculations; this makes it difficult for non-experts to use.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Neural Information Processing SystemsApr-10-2023, 09:03:10 GMT

Fast ɛ-free Inference of Simulation Models with Bayesian Conditional Density Estimation

Many statistical models can be simulated forwards but have intractable likelihoods. Approximate Bayesian Computation (ABC) methods are used to infer properties of these models from data. Traditionally these methods approximate the posterior over parameters by conditioning on data being inside an ɛ-ball around the observed data, which is only correct in the limit ɛ 0. Monte Carlo methods can then draw samples from the approximate posterior to approximate predictions or error bars on parameters.

mdn, posterior, simulation, (13 more...)

Country: Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Diao, Michael, Balasubramanian, Krishnakumar, Chewi, Sinho, Salim, Adil

Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space

arXiv.org Artificial IntelligenceApr-10-2023

Variational inference (VI) seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions. Of key interest in statistics and machine learning is Gaussian VI, which approximates $\pi$ by minimizing the Kullback-Leibler (KL) divergence to $\pi$ over the space of Gaussians. In this work, we develop the (Stochastic) Forward-Backward Gaussian Variational Inference (FB-GVI) algorithm to solve Gaussian VI. Our approach exploits the composite structure of the KL divergence, which can be written as the sum of a smooth term (the potential) and a non-smooth term (the entropy) over the Bures-Wasserstein (BW) space of Gaussians endowed with the Wasserstein distance. For our proposed algorithm, we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave, as well as the first convergence guarantees to first-order stationary solutions when $\pi$ is only log-smooth.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2304.05398

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Wakayama, Shohei, Ahmed, Nisar

Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing

arXiv.org Artificial IntelligenceApr-10-2023

Humans cannot always be treated as oracles for collaborative sensing. Robots thus need to maintain beliefs over unknown world states when receiving semantic data from humans, as well as account for possible discrepancies between human-provided data and these beliefs. To this end, this paper introduces the problem of semantic data association (SDA) in relation to conventional data association problems for sensor fusion. It then develops a novel probabilistic semantic data association (PSDA) algorithm to rigorously address SDA in general settings, unlike previous work on semantic data fusion which developed heuristic techniques for specific settings. PSDA is further incorporated into a recursive hybrid Bayesian data fusion scheme which uses Gaussian mixture priors for object states and softmax functions for semantic human sensor data likelihoods. Simulations of a multi-object search task show that PSDA enables robust collaborative state estimation under a wide range of conditions where semantic human sensor data can be erroneous or contain significant reference ambiguities.

artificial intelligence, machine learning, posterior, (17 more...)

doi: 10.1109/TRO.2023.3262111

2110.09621

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Asia > Japan > Honshū > Kansai > Wakayama Prefecture > Wakayama (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(2 more...)