AITopics

doi: 10.1039/D3DD00002H

2302.00615

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Quebec > Montreal (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(5 more...)

Genre: Research Report (0.92)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Energy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(3 more...)

arXiv.org Artificial IntelligenceJun-27-2023

GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

Murata, Naoki, Saito, Koichi, Lai, Chieh-Hsin, Takida, Yuhta, Uesaka, Toshimitsu, Mitsufuji, Yuki, Ermon, Stefano

Pre-trained diffusion models have been successfully used as priors in a variety of linear inverse problems, where the goal is to reconstruct a signal from noisy linear measurements. However, existing approaches require knowledge of the linear operator. In this paper, we propose GibbsDDRM, an extension of Denoising Diffusion Restoration Models (DDRM) to a blind setting in which the linear measurement operator is unknown. GibbsDDRM constructs a joint distribution of the data, measurements, and linear operator by using a pre-trained diffusion model for the data prior, and it solves the problem by posterior sampling with an efficient variant of a Gibbs sampler. The proposed method is problem-agnostic, meaning that a pre-trained diffusion model can be applied to various inverse problems without fine-tuning. In experiments, it achieved high performance on both blind image deblurring and vocal dereverberation tasks, despite the use of simple generic priors for the underlying linear operators.

artificial intelligence, gibbsddrm, machine learning, (12 more...)

2301.12686

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Zhang, Wenyu, Meyer, Florian

Multisensor Multiobject Tracking With High-Dimensional Object States

arXiv.org Artificial IntelligenceJun-27-2023

Passive monitoring of acoustic or radio sources has important applications in modern convenience, public safety, and surveillance. A key task in passive monitoring is multiobject tracking (MOT). This paper presents a Bayesian method for multisensor MOT for challenging tracking problems where the object states are high-dimensional, and the measurements follow a nonlinear model. Our method is developed in the framework of factor graphs and the sum-product algorithm (SPA) and implemented using random samples or "particles". The multimodal probability density functions (pdfs) provided by the SPA are effectively represented by a Gaussian mixture model (GMM). To perform the operations of the SPA in high-dimensional spaces, we make use of Particle flow (PFL). Here, particles are migrated towards regions of high likelihood based on the solution of a partial differential equation. This makes it possible to obtain good object detection and tracking performance even in challenging multisensor MOT scenarios with single sensor measurements that have a lower dimension than the object positions. We perform a numerical evaluation in a passive acoustic monitoring scenario where multiple sources are tracked in 3-D from 1-D time-difference-of-arrival (TDOA) measurements provided by pairs of hydrophones. Our numerical results demonstrate favorable detection and estimation accuracy compared to state-of-the-art reference techniques.

artificial intelligence, machine learning, particle, (19 more...)

2212.14556

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Oladyshkin, Sergey, Praditia, Timothy, Kröker, Ilja, Mohammadi, Farid, Nowak, Wolfgang, Otte, Sebastian

The Deep Arbitrary Polynomial Chaos Neural Network or how Deep Artificial Neural Networks could benefit from Data-Driven Homogeneous Chaos Theory

arXiv.org Machine LearningJun-26-2023

Artificial Intelligence and Machine learning have been widely used in various fields of mathematical computing, physical modeling, computational science, communication science, and stochastic analysis. Approaches based on Deep Artificial Neural Networks (DANN) are very popular in our days. Depending on the learning task, the exact form of DANNs is determined via their multi-layer architecture, activation functions and the so-called loss function. However, for a majority of deep learning approaches based on DANNs, the kernel structure of neural signal processing remains the same, where the node response is encoded as a linear superposition of neural activity, while the non-linearity is triggered by the activation functions. In the current paper, we suggest to analyze the neural signal processing in DANNs from the point of view of homogeneous chaos theory as known from polynomial chaos expansion (PCE). From the PCE perspective, the (linear) response on each node of a DANN could be seen as a $1^{st}$ degree multi-variate polynomial of single neurons from the previous layer, i.e. linear weighted sum of monomials. From this point of view, the conventional DANN structure relies implicitly (but erroneously) on a Gaussian distribution of neural signals. Additionally, this view revels that by design DANNs do not necessarily fulfill any orthogonality or orthonormality condition for a majority of data-driven applications. Therefore, the prevailing handling of neural signals in DANNs could lead to redundant representation as any neural signal could contain some partial information from other neural signals. To tackle that challenge, we suggest to employ the data-driven generalization of PCE theory known as arbitrary polynomial chaos (aPC) to construct a corresponding multi-variate orthonormal representations on each node of a DANN to obtain Deep arbitrary polynomial chaos neural networks.

artificial intelligence, dapc nn, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1016/j.neunet.2023.06.036

2306.14753

Country:

Europe > Germany > Baden-Württemberg (0.14)
Europe > United Kingdom > England (0.14)
North America > United States > Wisconsin (0.14)
(3 more...)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Guth, Stephen, Mojahed, Alireza, Sapsis, Themistoklis P.

Evaluation of machine learning architectures on the quantification of epistemic and aleatoric uncertainties in complex dynamical systems

Machine learning methods for the construction of data-driven reduced order model models are used in an increasing variety of engineering domains, especially as a supplement to expensive computational fluid dynamics for design problems. An important check on the reliability of surrogate models is Uncertainty Quantification (UQ), a self assessed estimate of the model error. Accurate UQ allows for cost savings by reducing both the required size of training data sets and the required safety factors, while poor UQ prevents users from confidently relying on model predictions. We examine several machine learning techniques, including both Gaussian processes and a family UQ-augmented neural networks: Ensemble neural networks (ENN), Bayesian neural networks (BNN), Dropout neural networks (D-NN), and Gaussian neural networks (G-NN). We evaluate UQ accuracy (distinct from model accuracy) using two metrics: the distribution of normalized residuals on validation data, and the distribution of estimated uncertainties. We apply these metrics to two model data sets, representative of complex dynamical systems: an ocean engineering problem in which a ship traverses irregular wave episodes, and a dispersive wave turbulence system with extreme events, the Majda-McLaughlin-Tabak model.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2306.15159

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Diego County > Vista (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Supervised Pretraining Can Learn In-Context Reinforcement Learning

Lee, Jonathan N., Xie, Annie, Pacchiano, Aldo, Chandak, Yash, Finn, Chelsea, Nachum, Ofir, Brunskill, Emma

Large transformer models trained on diverse datasets have shown a remarkable ability to learn in-context, achieving high few-shot performance on tasks they were not explicitly trained to solve. In this paper, we study the in-context learning capabilities of transformers in decision-making problems, i.e., reinforcement learning (RL) for bandits and Markov decision processes. To do so, we introduce and study Decision-Pretrained Transformer (DPT), a supervised pretraining method where the transformer predicts an optimal action given a query state and an in-context dataset of interactions, across a diverse set of tasks. This procedure, while simple, produces a model with several surprising capabilities. We find that the pretrained transformer can be used to solve a range of RL problems in-context, exhibiting both exploration online and conservatism offline, despite not being explicitly trained to do so. The model also generalizes beyond the pretraining distribution to new tasks and automatically adapts its decision-making strategies to unknown structure. Theoretically, we show DPT can be viewed as an efficient implementation of Bayesian posterior sampling, a provably sample-efficient RL algorithm. We further leverage this connection to provide guarantees on the regret of the in-context algorithm yielded by DPT, and prove that it can learn faster than algorithms used to generate the pretraining data. These results suggest a promising yet simple path towards instilling strong in-context decision-making abilities in transformers.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2306.14892

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
(2 more...)

Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems

Deng, Jiajun, Li, Guinan, Xie, Xurong, Jin, Zengrui, Cui, Mingyu, Wang, Tianzi, Hu, Shujie, Geng, Mengzhe, Liu, Xunying

Rich sources of variability in natural speech present significant challenges to current data intensive speech recognition technologies. To model both speaker and environment level diversity, this paper proposes a novel Bayesian factorised speaker-environment adaptive training and test time adaptation approach for Conformer ASR models. Speaker and environment level characteristics are separately modeled using compact hidden output transforms, which are then linearly or hierarchically combined to represent any speaker-environment combination. Bayesian learning is further utilized to model the adaptation parameter uncertainty. Experiments on the 300-hr WHAM noise corrupted Switchboard data suggest that factorised adaptation consistently outperforms the baseline and speaker label only adapted Conformers by up to 3.1% absolute (10.4% relative) word error rate reductions. Further analysis shows the proposed method offers potential for rapid adaption to unseen speaker-environment conditions.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2306.14608

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Sarra, Leopoldo, Marquardt, Florian

Deep Bayesian Experimental Design for Quantum Many-Body Systems

Bayesian experimental design is a technique that allows to efficiently select measurements to characterize a physical system by maximizing the expected information gain. Recent developments in deep neural networks and normalizing flows allow for a more efficient approximation of the posterior and thus the extension of this technique to complex high-dimensional situations. In this paper, we show how this approach holds promise for adaptive measurement strategies to characterize present-day quantum technology platforms. In particular, we focus on arrays of coupled cavities and qubit arrays. Both represent model systems of high relevance for modern applications, like quantum simulations and computing, and both have been realized in platforms where measurement and control can be exploited to characterize and counteract unavoidable disorder. Thus, they represent ideal targets for applications of Bayesian experimental design.

artificial intelligence, experiment, machine learning, (15 more...)

2306.1451

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Inverse square Levy walk emerging universally in goal-oriented tasks

Shinohara, Shuji, Morita, Daiki, Manome, Nobuhito, Hayashi, Ryota, Moriyama, Toru, Okamoto, Hiroshi, Gunji, Pegio-Yukio, Chung, Ung-il

The Levy walk in which the frequency of occurrence of step lengths follows a power-law distribution, can be observed in the migratory behavior of organisms at various levels. Levy walks with power exponents close to 2 are observed, and the reasons are unclear. This study aims to propose a model that universally generates inverse square Levy walks (called Cauchy walks) and to identify the conditions under which Cauchy walks appear. We demonstrate that Cauchy walks emerge universally in goal-oriented tasks. We use the term "goal-oriented" when the goal is clear, but this can be achieved in different ways, which cannot be uniquely determined. We performed a simulation in which an agent observed the data generated from a probability distribution in a two-dimensional space and successively estimated the central coordinates of that probability distribution. The agent has a model of probability distribution as a hypothesis for data-generating distribution and can modify the model such that each time a data point is observed, thereby increasing the estimated probability of occurrence of the observed data. To achieve this, the center coordinates of the model must be moved closer to those of the observed data. However, in the case of a two-dimensional space, arbitrariness arises in the direction of correction of the center; this task is goal oriented. We analyze two cases: a strategy that allocates the amount of modification randomly in the x- and y-directions, and a strategy that determines allocation such that movement is minimized. The results reveal that when a random strategy is used, the Cauchy walk appears. When the minimum strategy is used, the Brownian walk appears. The presence or absence of the constraint of minimizing the amount of movement may be a factor that causes the difference between Brownian and Levy walks.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2305.15559

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Rodemann, Julian, Goschenhofer, Jann, Dorigatti, Emilio, Nagler, Thomas, Augustin, Thomas

Approximately Bayes-Optimal Pseudo Label Selection

Semi-supervised learning by self-training heavily relies on pseudo-label selection (PLS). The selection often depends on the initial model fit on labeled data. Early overfitting might thus be propagated to the final model by selecting instances with overconfident but erroneous predictions, often referred to as confirmation bias. This paper introduces BPLS, a Bayesian framework for PLS that aims to mitigate this issue. At its core lies a criterion for selecting instances to label: an analytical approximation of the posterior predictive of pseudo-samples. We derive this selection criterion by proving Bayes optimality of the posterior predictive of pseudo-samples. We further overcome computational hurdles by approximating the criterion analytically. Its relation to the marginal likelihood allows us to come up with an approximation based on Laplace's method and the Gaussian integral. We empirically assess BPLS for parametric generalized linear and non-parametric generalized additive models on simulated and real-world data. When faced with high-dimensional data prone to overfitting, BPLS outperforms traditional PLS methods.

artificial intelligence, likelihood, machine learning, (17 more...)

2302.08883

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)