AITopics

2408.15649

Country:

North America > United States (0.28)
Pacific Ocean (0.04)
Europe > Netherlands > Limburg > Maastricht (0.04)
(28 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Government (0.67)
Consumer Products & Services > Restaurants (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(2 more...)

Polson, Nick, Ruggeri, Fabrizio, Sokolov, Vadim

Generative Bayesian Computation for Maximum Expected Utility

arXiv.org Machine LearningAug-28-2024

Generative Bayesian Computation (GBC) methods are developed to provide an efficient computational solution for maximum expected utility (MEU). We propose a density-free generative method based on quantiles that naturally calculates expected utility as a marginal of quantiles. Our approach uses a deep quantile neural estimator to directly estimate distributional utilities. Generative methods assume only the ability to simulate from the model and parameters and as such are likelihood-free. A large training dataset is generated from parameters and output together with a base distribution. Our method a number of computational advantages primarily being density-free with an efficient estimator of expected utility. A link with the dual theory of expected utility and risk taking is also discussed. To illustrate our methodology, we solve an optimal portfolio allocation problem with Bayesian learning and a power utility (a.k.a. fractional Kelly criterion). Finally, we conclude with directions for future research.

generative method, learning, neural network, (13 more...)

2408.16101

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Spain > Galicia > Madrid (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningAug-28-2024

Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods

Hu, Xinyang, Zhang, Fengzhuo, Chen, Siyu, Yang, Zhuoran

Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems using pretrained large language models (LLMs). In this work, we analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity. To this end, we introduce a multi-step latent variable model that encapsulates the reasoning process, where the latent variable encodes the task information. Under this framework, we demonstrate that when the pretraining dataset is sufficiently large, the estimator formed by CoT prompting is equivalent to a Bayesian estimator. This estimator effectively solves the multi-step reasoning problem by aggregating a posterior distribution inferred from the demonstration examples in the prompt. Moreover, we prove that the statistical error of the CoT estimator can be decomposed into two main components: (i) a prompting error, which arises from inferring the true task using CoT prompts, and (ii) the statistical error of the pretrained LLM. We establish that, under appropriate assumptions, the prompting error decays exponentially to zero as the number of demonstrations increases. Additionally, we explicitly characterize the approximation and generalization errors of the pretrained LLM. Notably, we construct a transformer model that approximates the target distribution of the multi-step reasoning problem with an error that decreases exponentially in the number of transformer blocks. Our analysis extends to other variants of CoT, including Self-Consistent CoT, Tree-of-Thought, and Selection-Inference, offering a broad perspective on the efficacy of these methods. We also provide numerical experiments to validate the theoretical findings.

prompt cot, reasoning step, transformer block, (12 more...)

2408.14511

Country:

North America > United States > New York (0.04)
Asia > India > Maharashtra > Mumbai (0.04)
Asia > South Korea > Seoul > Seoul (0.04)
(11 more...)

Genre:

Workflow (1.00)
Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

arXiv.org Artificial IntelligenceAug-28-2024

Negative Binomial Matrix Completion

Lu, Yu, Bui, Kevin, Marcia, Roummel F.

Matrix completion focuses on recovering missing or incomplete information in matrices. This problem arises in various applications, including image processing and network analysis. Previous research proposed Poisson matrix completion for count data with noise that follows a Poisson distribution, which assumes that the mean and variance are equal. Since overdispersed count data, whose variance is greater than the mean, is more likely to occur in realistic settings, we assume that the noise follows the negative binomial (NB) distribution, which can be more general than the Poisson distribution. In this paper, we introduce NB matrix completion by proposing a nuclear-norm regularized model that can be solved by proximal gradient descent. In our experiments, we demonstrate that the NB model outperforms Poisson matrix completion in various noise and missing data settings on real data.

completion, matrix completion, nb model, (15 more...)

2408.16113

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > California > Merced County > Merced (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Jeong, Woojin, Min, Seungki

Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits

arXiv.org Artificial IntelligenceAug-28-2024

We consider a Bayesian budgeted multi-armed bandit problem, in which each arm consumes a different amount of resources when selected and there is a budget constraint on the total amount of resources that can be used. Budgeted Thompson Sampling (BTS) offers a very effective heuristic to this problem, but its arm-selection rule does not take into account the remaining budget information. We adopt \textit{Information Relaxation Sampling} framework that generalizes Thompson Sampling for classical $K$-armed bandit problems, and propose a series of algorithms that are randomized like BTS but more carefully optimize their decisions with respect to the budget constraint. In a one-to-one correspondence with these algorithms, a series of performance benchmarks that improve the conventional benchmark are also suggested. Our theoretical analysis and simulation results show that our algorithms (and our benchmarks) make incremental improvements over BTS (respectively, the conventional benchmark) across various settings including a real-world example.

conference paper, inner problem, irs, (15 more...)

2408.15535

Country: Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Marketing (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Chatterjee, Anirban, Niu, Ziang, Bhattacharya, Bhaswar B.

A Kernel-Based Conditional Two-Sample Test Using Nearest Neighbors (with Applications to Calibration, Regression Curves, and Simulation-Based Inference)

arXiv.org Machine LearningAug-28-2024

In this paper we introduce a kernel-based measure for detecting differences between two conditional distributions. Using the `kernel trick' and nearest-neighbor graphs, we propose a consistent estimate of this measure which can be computed in nearly linear time (for a fixed number of nearest neighbors). Moreover, when the two conditional distributions are the same, the estimate has a Gaussian limit and its asymptotic variance has a simple form that can be easily estimated from the data. The resulting test attains precise asymptotic level and is universally consistent for detecting differences between two conditional distributions. We also provide a resampling based test using our estimate that applies to the conditional goodness-of-fit problem, which controls Type I error in finite samples and is asymptotically consistent with only a finite number of resamples. A method to de-randomize the resampling test is also presented. The proposed methods can be readily applied to a broad range of problems, ranging from classical nonparametric statistics to modern machine learning. Specifically, we explore three applications: testing model calibration, regression curve evaluation, and validation of emulator models in simulation-based inference. We illustrate the superior performance of our method for these tasks, both in simulations as well as on real data. In particular, we apply our method to (1) assess the calibration of neural network models trained on the CIFAR-10 dataset, (2) compare regression functions for wind power generation across two different turbines, and (3) validate emulator models on benchmark examples with intractable posteriors and for generating synthetic `redshift' associated with galaxy images.

calibration, conditional distribution, ecmmd test, (16 more...)

2407.1655

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
(6 more...)

Genre: Research Report > New Finding (0.92)

Industry: Energy > Renewable > Wind (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Silander, Tomi, Leppä-aho, Janne, Jääsaari, Elias, Roos, Teemu

Quotient Normalized Maximum Likelihood Criterion for Learning Bayesian Network Structures

arXiv.org Artificial IntelligenceAug-27-2024

We introduce an information theoretic criterion for Bayesian network structure learning which we call quotient normalized maximum likelihood (qNML). In contrast to the closely related factorized normalized maximum likelihood criterion, qNML satisfies the property of score equivalence. It is also decomposable and completely free of adjustable hyperparameters. For practical computations, we identify a remarkably accurate approximation proposed earlier by Szpankowski and Weinberger. Experiments on both simulated and real data demonstrate that the new criterion leads to parsimonious models with good predictive accuracy.

criterion, logp, qnml, (11 more...)

2408.14935

Country:

Europe > Finland > Uusimaa > Helsinki (0.04)
Asia (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Wakayama, Tomoya, Sugasawa, Shonosuke

Ensemble Prediction via Covariate-dependent Stacking

arXiv.org Machine LearningAug-27-2024

This study proposes a novel approach to ensemble prediction, called ``covariate-dependent stacking'' (CDST). Unlike traditional stacking methods, CDST allows model weights to vary flexibly as a function of covariates, thereby enhancing predictive performance in complex scenarios. We formulate the covariate-dependent weights through combinations of basis functions, estimate them by optimizing cross-validation, and develop an expectation-maximization algorithm, ensuring computational efficiency. To analyze the theoretical properties, we establish an oracle inequality regarding the expected loss to be minimized for estimating model weights. Through comprehensive simulation studies and an application to large-scale land price prediction, we demonstrate that the CDST consistently outperforms conventional model averaging methods, particularly on datasets where some models fail to capture the underlying complexity. Our findings suggest that the CDST is especially valuable for, but not limited to, spatio-temporal prediction problems, offering a powerful tool for researchers and practitioners in various data analysis fields.

covariate, prediction, regression, (17 more...)

2408.09755

Country:

Europe > Austria > Vienna (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
(2 more...)

arXiv.org Artificial IntelligenceAug-27-2024

Points2Plans: From Point Clouds to Long-Horizon Plans with Composable Relational Dynamics

Huang, Yixuan, Agia, Christopher, Wu, Jimmy, Hermans, Tucker, Bohg, Jeannette

We present Points2Plans, a framework for composable planning with a relational dynamics model that enables robots to solve long-horizon manipulation tasks from partial-view point clouds. Given a language instruction and a point cloud of the scene, our framework initiates a hierarchical planning procedure, whereby a language model generates a high-level plan and a sampling-based planner produces constraint-satisfying continuous parameters for manipulation primitives sequenced according to the high-level plan. Key to our approach is the use of a relational dynamics model as a unifying interface between the continuous and symbolic representations of states and actions, thus facilitating language-driven planning from high-dimensional perceptual input such as point clouds. Whereas previous relational dynamics models require training on datasets of multi-step manipulation scenarios that align with the intended test scenarios, Points2Plans uses only single-step simulated training data while generalizing zero-shot to a variable number of steps during real-world evaluations. We evaluate our approach on tasks involving geometric reasoning, multi-object interactions, and occluded object reasoning in both simulated and real-world settings. Results demonstrate that Points2Plans offers strong generalization to unseen long-horizon tasks in the real world, where it solves over 85% of evaluated tasks while the next best baseline solves only 50%. Qualitative demonstrations of our approach operating on a mobile manipulator platform are made available at sites.google.com/stanford.edu/points2plans.

international conference, points2plan, predicate, (15 more...)

2408.14769

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.24)
North America > United States > Utah (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(5 more...)

arXiv.org Artificial IntelligenceAug-27-2024

Estimating Causal Effects from Learned Causal Networks

Raichev, Anna, Ihler, Alexander, Tian, Jin, Dechter, Rina

The standard approach to answering an identifiable causal-effect query (e.g., $P(Y|do(X)$) when given a causal diagram and observational data is to first generate an estimand, or probabilistic expression over the observable variables, which is then evaluated using the observational data. In this paper, we propose an alternative paradigm for answering causal-effect queries over discrete observable variables. We propose to instead learn the causal Bayesian network and its confounding latent variables directly from the observational data. Then, efficient probabilistic graphical model (PGM) algorithms can be applied to the learned model to answer queries. Perhaps surprisingly, we show that this \emph{model completion} learning approach can be more effective than estimand approaches, particularly for larger models in which the estimand expressions become computationally difficult. We illustrate our method's potential using a benchmark collection of Bayesian networks and synthetically generated causal models.

algorithm, em4ci, query, (16 more...)

2408.14101

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Iowa (0.04)
North America > United States > California > Orange County > Irvine (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)