AITopics

2304.07665

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Qu, Zhaonan, Galichon, Alfred, Ugander, Johan

On Sinkhorn's Algorithm and Choice Modeling

arXiv.org Artificial IntelligenceSep-30-2023

For a broad class of choice and ranking models based on Luce's choice axiom, including the Bradley--Terry--Luce and Plackett--Luce models, we show that the associated maximum likelihood estimation problems are equivalent to a classic matrix balancing problem with target row and column sums. This perspective opens doors between two seemingly unrelated research areas, and allows us to unify existing algorithms in the choice modeling literature as special instances or analogs of Sinkhorn's celebrated algorithm for matrix balancing. We draw inspirations from these connections and resolve important open problems on the study of Sinkhorn's algorithm. We first prove the global linear convergence of Sinkhorn's algorithm for non-negative matrices whenever finite solutions to the matrix balancing problem exist. We characterize this global rate of convergence in terms of the algebraic connectivity of the bipartite graph constructed from data. Next, we also derive the sharp asymptotic rate of linear convergence, which generalizes a classic result of Knight (2008), but with a more explicit analysis that exploits an intrinsic orthogonality structure. To our knowledge, these are the first quantitative linear convergence results for Sinkhorn's algorithm for general non-negative matrices and positive marginals. The connections we establish in this paper between matrix balancing and choice modeling could help motivate further transmission of ideas and interesting results in both directions.

algorithm, matrix, sinkhorn, (14 more...)

arXiv.org Artificial Intelligence

2310.0026

Country:

Europe > Ireland (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry:

Transportation (0.67)
Government (0.67)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Schlichting, Marc R., Boord, Nina V., Corso, Anthony L., Kochenderfer, Mykel J.

SAVME: Efficient Safety Validation for Autonomous Systems Using Meta-Learning

arXiv.org Artificial IntelligenceSep-30-2023

Discovering potential failures of an autonomous system is important prior to deployment. Falsification-based methods are often used to assess the safety of such systems, but the cost of running many accurate simulation can be high. The validation can be accelerated by identifying critical failure scenarios for the system under test and by reducing the simulation runtime. We propose a Bayesian approach that integrates meta-learning strategies with a multi-armed bandit framework. Our method involves learning distributions over scenario parameters that are prone to triggering failures in the system under test, as well as a distribution over fidelity settings that enable fast and accurate simulations. In the spirit of meta-learning, we also assess whether the learned fidelity settings distribution facilitates faster learning of the scenario parameter distributions for new scenarios. We showcase our methodology using a cutting-edge 3D driving simulator, incorporating 16 fidelity settings for an autonomous vehicle stack that includes camera and lidar sensors. We evaluate various scenarios based on an autonomous vehicle pre-crash typology. As a result, our approach achieves a significant speedup, up to 18 times faster compared to traditional methods that solely rely on a high-fidelity simulator.

fidelity, scenario, simulator, (16 more...)

arXiv.org Artificial Intelligence

2309.12474

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (0.46)
Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.61)
(2 more...)

Lin, Alexander, Ba, Demba

An Efficient Algorithm for Clustered Multi-Task Compressive Sensing

arXiv.org Machine LearningSep-30-2023

This paper considers clustered multi-task compressive sensing, a hierarchical model that solves multiple compressive sensing tasks by finding clusters of tasks that leverage shared information to mutually improve signal reconstruction. The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions. The main bottleneck involves repeated matrix inversion and log-determinant computation for multiple large covariance matrices. We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices. Our approach combines Monte Carlo sampling with iterative linear solvers. Our experiments reveal that compared to the existing baseline, our algorithm can be up to thousands of times faster and an order of magnitude more memory-efficient.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2310.0042

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Gradient and Uncertainty Enhanced Sequential Sampling for Global Fit

Lämmle, Sven, Bogoclu, Can, Cremanns, Kevin, Roos, Dirk

Most of these applications require computationally demanding simulation models. Machine learning (ML) models have emerged to mitigate the computational cost and are used as surrogates. For this task, ML models learn the relationship between the response of the expensive simulations from a dataset containing past observations. Various ML methods were proposed as surrogate models (sometimes referred as response surface methods), e.g. as solvers for partial differential equations (PDEs) describing the behavior of high-dimensional vector spaces or fields [1-3], or probabilistic approximators of parametric response functions [4]. Among these types of surrogate models, Gaussian Process Regression (GP) [5] is a popular choice [6, 7] because of its flexibility and ability to predict the model uncertainty. However, drawbacks are the selection of a suitable covariance (kernel) function that is application dependent, and the computational burden for larger data sets. Various extensions to GPs such as the deep GPs[8], sparse GPs based on variational inference [9, 10], and efficient matrix decomposition [11] were developed to overcome some of these limitations. One particularly promising approach is the combination of GPs with Artificial Neural Networks (ANNs) to Deep Gaussian Covariance Networks (DGCNs) [12, 13] to learn the non-stationary hyperparameters of the GP together with combinations of different covariance functions.

artificial intelligence, machine learning, surrogate model, (17 more...)

doi: 10.1016/j.cma.2023.116226

2310.0011

Country:

Europe > Germany (0.28)
Asia > Middle East > Israel > Mediterranean Sea (0.24)
North America > United States > New York (0.14)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Artificial IntelligenceSep-29-2023

Human-like Few-Shot Learning via Bayesian Reasoning over Natural Language

Ellis, Kevin

A core tension in models of concept learning is that the model must carefully balance the tractability of inference against the expressivity of the hypothesis class. Humans, however, can efficiently learn a broad range of concepts. We introduce a model of inductive learning that seeks to be human-like in that sense. It implements a Bayesian reasoning process where a language model first proposes candidate hypotheses expressed in natural language, which are then re-weighed by a prior and a likelihood. By estimating the prior from human data, we can predict human judgments on learning problems involving numbers and sets, spanning concepts that are generative, discriminative, propositional, and higher-order.

hypothesis, rectangle, triangle, (16 more...)

arXiv.org Artificial Intelligence

2306.02797

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(5 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Statistical physics, Bayesian inference and neural information processing

Grant, Erin, Nestler, Sandra, Şimşek, Berfin, Solla, Sara

Lecture notes from the course given by Professor Sara A. Solla at the Les Houches summer school on "Statistical physics of Machine Learning". The notes discuss neural information processing through the lens of Statistical Physics. Contents include Bayesian inference and its connection to a Gibbs description of learning and generalization, Generalized Linear Models as a controlled alternative to backpropagation through time, and linear and non-linear techniques for dimensionality reduction.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2309.17006

Country:

North America > United States > Utah (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression

Zhai, Runtian, Liu, Bingbin, Risteski, Andrej, Kolter, Zico, Ravikumar, Pradeep

Data augmentation is critical to the empirical success of modern self-supervised representation learning, such as contrastive learning and masked language modeling. However, a theoretical understanding of the exact role of augmentation remains limited. Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator, suggesting that learning a linear probe atop such representation can be connected to RKHS regression. Building on this insight, this work delves into a statistical analysis of augmentation-based pretraining. Starting from the isometry property, a geometric characterization of the target function given by the augmentation, we disentangle the effects of the model and the augmentation, and prove two generalization bounds that are free of model complexity. Our first bound works for an arbitrary encoder, where the prediction error is decomposed as the sum of an estimation error incurred by fitting a linear probe with RKHS regression, and an approximation error entailed by RKHS approximation. Our second bound specifically addresses the case where the encoder is near-optimal, that is it approximates the top-d eigenspace of the RKHS induced by the augmentation. A key ingredient in our analysis is the augmentation complexity, which we use to quantitatively compare different augmentations and analyze their impact on downstream performance. It is widely acknowledged that better data augmentation techniques have been a major driving force in many recent breakthroughs in self-supervised representation learning. For example, contrastive learning (Chen et al., 2020) with aggressive random cropping and strong color distortion greatly improves the performance of vision tasks, and masked prediction with random masking and replacement is among the state-of-the-art representation learning methods for both natural language processing (Devlin et al., 2019) and vision (He et al., 2022). Due to their extraordinary empirical successes, developing a rigorous theoretical understanding of augmentation-based representation learning is an important open problem, and is crucial for inventing new, better and principled augmentation techniques. A key advance was recently made by HaoChen et al. (2021), who analyzed a variant of contrastive representation learning, termed spectral contrastive learning, by connecting it to learning eigenfunctions of a Laplacian operator over a population augmentation graph. This has been followed up by multiple works which extended these results to more general contrastive learning approaches (Saunshi et al., 2022; Johnson et al., 2023; Cabannes et al., 2023).

artificial intelligence, machine learning, natural language, (18 more...)

2306.00788

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Virginia (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)

Provable Offline Preference-Based Reinforcement Learning

Zhan, Wenhao, Uehara, Masatoshi, Kallus, Nathan, Lee, Jason D., Sun, Wen

In this paper, we investigate the problem of offline Preference-based Reinforcement Learning (PbRL) with human feedback where feedback is available in the form of preference between trajectory pairs rather than explicit rewards. Our proposed algorithm consists of two main steps: (1) estimate the implicit reward using Maximum Likelihood Estimation (MLE) with general function approximation from offline data and (2) solve a distributionally robust planning problem over a confidence set around the MLE. We consider the general reward setting where the reward can be defined over the whole trajectory and provide a novel guarantee that allows us to learn any target policy with a polynomial number of samples, as long as the target policy is covered by the offline data. This guarantee is the first of its kind with general function approximation. To measure the coverage of the target policy, we introduce a new single-policy concentrability coefficient, which can be upper bounded by the per-trajectory concentrability coefficient. We also establish lower bounds that highlight the necessity of such concentrability and the difference from standard RL, where state-action-wise rewards are directly observed. We further extend and analyze our algorithm when the feedback is given over action pairs.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2305.14816

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Workflow (0.66)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)

A Variational Perspective on Solving Inverse Problems with Diffusion Models

Mardani, Morteza, Song, Jiaming, Kautz, Jan, Vahdat, Arash

Diffusion models have emerged as a key pillar of foundation models in visual domains. One of their critical applications is to universally solve different downstream inverse tasks via a single diffusion prior without re-training for each task. Most inverse tasks can be formulated as inferring a posterior distribution over data (e.g., a full image) given a measurement (e.g., a masked image). This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable. To cope with this challenge, we propose a variational approach that by design seeks to approximate the true posterior distribution. We show that our approach naturally leads to regularization by denoising diffusion process (RED-Diff) where denoisers at different timesteps concurrently impose different structural constraints over the image. To gauge the contribution of denoisers from different timesteps, we propose a weighting mechanism based on signal-to-noise-ratio (SNR). Our approach provides a new variational perspective for solving inverse problems with diffusion models, allowing us to formulate sampling as stochastic optimization, where one can simply apply off-the-shelf solvers with lightweight iterates. Our experiments for image restoration tasks such as inpainting and superresolution demonstrate the strengths of our method compared with state-of-the-art sampling-based diffusion models.

artificial intelligence, inverse problem, machine learning, (19 more...)

2305.04391

Country: Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)