AITopics

2304.00917

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Panigrahi, Snigdha, Fry, Kevin, Taylor, Jonathan

Exact Selective Inference with Randomization

arXiv.org Machine LearningDec-22-2023

The polyhedral method by Lee et al. (2016) introduced confidence intervals for exact selective inference in Gaussian regression models. This method provides valid inferences for selected parameters by conditioning on the outcome of selection. A pivot is obtained for each selected parameter from a truncated Gaussian distribution, provided the outcome of selection can be described by linear constraints, also known as polyhedral constraints. However, as shown by Kivaranovic and Leeb (2021), confidence intervals based on this pivot can have infinite length in expectation. Randomizing data at the time of selection and conditioning on the outcome of randomized selection produces narrower confidence intervals than the polyhedral method.

artificial intelligence, bayesian inference, machine learning, (16 more...)

2212.1294

Country: North America > United States > Michigan (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Rothfuss, Jonas, Josifoski, Martin, Fortuin, Vincent, Krause, Andreas

Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice

arXiv.org Machine LearningDec-22-2023

Meta-Learning aims to speed up the learning process on new tasks by acquiring useful inductive biases from datasets of related learning tasks. While, in practice, the number of related tasks available is often small, most of the existing approaches assume an abundance of tasks; making them unrealistic and prone to overfitting. A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks. In this work, we provide a theoretical analysis using the PAC-Bayesian theory and present a generalization bound for meta-learning, which was first derived by Rothfuss et al. (2021a). Crucially, the bound allows us to derive the closed form of the optimal hyper-posterior, referred to as PACOH, which leads to the best performance guarantees. We provide a theoretical analysis and empirical case study under which conditions and to what extent these guarantees for meta-learning improve upon PAC-Bayesian per-task learning bounds. The closed-form PACOH inspires a practical meta-learning approach that avoids the reliance on bi-level optimization, giving rise to a stochastic optimization problem that is amenable to standard variational methods that scale well. Our experiments show that, when instantiating the PACOH with Gaussian processes and Bayesian Neural Networks models, the resulting methods are more scalable, and yield state-of-the-art performance, both in terms of predictive accuracy and the quality of uncertainty estimates.

artificial intelligence, machine learning, scalable pac-bayesian meta-learning, (12 more...)

2211.07206

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area (0.93)
Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sidiropoulos, Nicholas D., Karakasis, Paris, Konar, Aritra

Minimizing low-rank models of high-order tensors: Hardness, span, tight relaxation, and applications

We consider the problem of finding the smallest or largest entry of a tensor of order N that is specified via its rank decomposition. Stated in a different way, we are given N sets of R-dimensional vectors and we wish to select one vector from each set such that the sum of the Hadamard product of the selected vectors is minimized or maximized. We show that this fundamental tensor problem is NP-hard for any tensor rank higher than one, and polynomial-time solvable in the rank-one case. We also propose a continuous relaxation and prove that it is tight for any rank. For low-enough ranks, the proposed continuous reformulation is amenable to low-complexity gradient-based optimization, and we propose a suite of gradient-based optimization algorithms drawing from projected gradient descent, Frank-Wolfe, or explicit parametrization of the relaxed constraints. We also show that our core results remain valid no matter what kind of polyadic tensor model is used to represent the tensor of interest, including Tucker, HOSVD/MLSVD, tensor train, or tensor ring. Next, we consider the class of problems that can be posed as special instances of the problem of interest. We show that this class includes the partition problem (and thus all NP-complete problems via polynomial-time transformation), integer least squares, integer linear programming, integer quadratic programming, sign retrieval (a special kind of mixed integer programming / restricted version of phase retrieval), and maximum likelihood decoding of parity check codes. We demonstrate promising experimental results on a number of hard problems, including state-of-art performance in decoding low density parity check codes and general parity check codes.

artificial intelligence, machine learning, tensor, (19 more...)

doi: 10.1109/TSP.2023.3338062

2210.11413

Country:

Europe > Greece (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
(3 more...)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Abdulsamad, Hany, Iqbal, Sahel, Corenflos, Adrien, Särkkä, Simo

Risk-Sensitive Stochastic Optimal Control as Rao-Blackwellized Markovian Score Climbing

Stochastic optimal control of dynamical systems is a crucial challenge in sequential decision-making. Recently, control-as-inference approaches have had considerable success, providing a viable risk-sensitive framework to address the exploration-exploitation dilemma. Nonetheless, a majority of these techniques only invoke the inference-control duality to derive a modified risk objective that is then addressed within a reinforcement learning framework. This paper introduces a novel perspective by framing risk-sensitive stochastic control as Markovian score climbing under samples drawn from a conditional particle filter. Our approach, while purely inference-centric, provides asymptotically unbiased estimates for gradient-based policy optimization with optimal importance weighting and no explicit value function learning. To validate our methodology, we apply it to the task of learning neural non-Gaussian feedback policies, showcasing its efficacy on numerical benchmarks of stochastic dynamical systems.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2312.14

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation

Xia, Yu, Arian, Ali, Narayanamoorthy, Sriram, Mabry, Joshua

Significant research effort has been devoted in recent years to developing personalized pricing, promotions, and product recommendation algorithms that can leverage rich customer data to learn and earn. Systematic benchmarking and evaluation of these causal learning systems remains a critical challenge, due to the lack of suitable datasets and simulation environments. In this work, we propose a multi-stage model for simulating customer shopping behavior that captures important sources of heterogeneity, including price sensitivity and past experiences. We embedded this model into a working simulation environment -- RetailSynth. RetailSynth was carefully calibrated on publicly available grocery data to create realistic synthetic shopping transactions. Multiple pricing policies were implemented within the simulator and analyzed for impact on revenue, category penetration, and customer retention. Applied researchers can use RetailSynth to validate causal demand models for multi-category retail and to incorporate realistic price sensitivity into emerging benchmarking suites for personalized pricing, promotions, and product recommendations.

category, customer, discount, (15 more...)

2312.14095

Country:

Oceania > Australia > Western Australia > Perth (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (1.00)

Industry: Retail (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
(2 more...)

Zhang, Keming, Bloom, Joshua S., van der Walt, Stéfan, Hernitschek, Nina

nbi: the Astronomer's Package for Neural Posterior Estimation

Despite the promise of Neural Posterior Estimation (NPE) methods in astronomy, the adaptation of NPE into the routine inference workflow has been slow. We identify three critical issues: the need for custom featurizer networks tailored to the observed data, the inference inexactness, and the under-specification of physical forward models. To address the first two issues, we introduce a new framework and open-source software nbi (Neural Bayesian Inference), which supports both amortized and sequential NPE. First, nbi provides built-in "featurizer" networks with demonstrated efficacy on sequential data, such as light curve and spectra, thus obviating the need for this customization on the user end. Second, we introduce a modified algorithm SNPE-IS, which facilities asymptotically exact inference by using the surrogate posterior under NPE only as a proposal distribution for importance sampling. These features allow nbi to be applied off-the-shelf to astronomical inference problems involving light curves and spectra. We discuss how nbi may serve as an effective alternative to existing methods such as Nested Sampling. Our package is at https://github.com/kmzzhang/nbi.

effective sample size, posterior, surrogate posterior, (11 more...)

2312.03824

Country:

South America > Chile > Antofagasta Region > Antofagasta Province > Antofagasta (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.41)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

Peters, Lasse, Bajcsy, Andrea, Chiu, Chih-Yuan, Fridovich-Keil, David, Laine, Forrest, Ferranti, Laura, Alonso-Mora, Javier

Contingency Games for Multi-Agent Interaction

Contingency planning, wherein an agent generates a set of possible plans conditioned on the outcome of an uncertain event, is an increasingly popular way for robots to act under uncertainty. In this work we take a game-theoretic perspective on contingency planning, tailored to multi-agent scenarios in which a robot's actions impact the decisions of other agents and vice versa. The resulting contingency game allows the robot to efficiently interact with other agents by generating strategic motion plans conditioned on multiple possible intents for other actors in the scene. Contingency games are parameterized via a scalar variable which represents a future time when intent uncertainty will be resolved. By estimating this parameter online, we construct a game-theoretic motion planner that adapts to changing beliefs while anticipating future certainty. We show that existing variants of game-theoretic planning under uncertainty are readily obtained as special cases of contingency games. Through a series of simulated autonomous driving scenarios, we demonstrate that contingency games close the gap between certainty-equivalent games that commit to a single hypothesis and non-contingent multi-hypothesis games that do not account for future uncertainty reduction.

contingency game, hypothesis, robot, (15 more...)

2304.05483

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > Netherlands > South Holland > Delft (0.04)
North America > United States > Tennessee > Davidson County > Nashville (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Information Technology > Robotics & Automation (0.48)
Transportation > Ground > Road (0.48)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine LearningDec-21-2023

Learned reconstruction methods for inverse problems: sample error estimates

Ratti, Luca

The mathematical treatment of inverse problems has proved to be a lively and attractive research field, driven and motivated by a wide variety of applications and by the theoretical challenges induced by their ill-posed nature. In order to provide more accurate and reliable strategies, especially for the reconstruction task, the scientific research in the field has shown a growing interest in the topic of learned reconstruction, or data-driven, methods, to combine classical, model-based approaches with valuable information of statistical nature. This has represented a natural outcome and development of the analysis of inverse problems, both on a numerical and on a theoretical side: indeed, the idea of leveraging prior knowledge on the solution has traditionally been considered to mitigate ill-posedness, as a regularization tool as much as a support for the reconstruction. We have now witnessed the emergence of several learning-based approaches to inverse problems, providing, in many cases, striking numerical results in terms of accuracy and efficiency. Moreover, significant interest has grown in the direction of theoretical guarantees for such techniques, spanning from the demand of interpretability and reliability, up to the issues of stability and convergence [8, 55]. Despite several distinct avenues have emerged, which are now well-established and are developing independently (to name a few: generative models, unrolled techniques, Plug and Play schemes), it is possible to provide a unifying overview of them from the point of view of statistical learning theory [20]. In this context, the goal pursued by this paper is twofold. On the one side, it aims to provide a general theoretical framework in statistical learning that is able to comprehend a large family of data-driven reconstruction methods.

inverse problem, random variable, regularization, (14 more...)

2312.14078

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)

Genre:

Overview (0.67)
Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Zhang, Liyi, McCoy, R. Thomas, Sumers, Theodore R., Zhu, Jian-Qiao, Griffiths, Thomas L.

Deep de Finetti: Recovering Topic Distributions from Large Language Models

arXiv.org Machine LearningDec-21-2023

Large language models (LLMs) can produce long, coherent passages of text, suggesting that LLMs, although trained on next-word prediction, must represent the latent structure that characterizes a document. Prior work has found that internal representations of LLMs encode one aspect of latent structure, namely syntax; here we investigate a complementary aspect, namely the document's topic structure. We motivate the hypothesis that LLMs capture topic structure by connecting LLM optimization to implicit Bayesian inference. De Finetti's theorem shows that exchangeable probability distributions can be represented as a mixture with respect to a latent generating distribution. Although text is not exchangeable at the level of syntax, exchangeability is a reasonable starting assumption for topic structure. We thus hypothesize that predicting the next token in text will lead LLMs to recover latent topic distributions. We examine this hypothesis using Latent Dirichlet Allocation (LDA), an exchangeable probabilistic topic model, as a target, and we show that the representations formed by LLMs encode both the topics used to generate synthetic data and those used to explain natural corpus data.

large language model, machine learning, natural language, (19 more...)

2312.14226

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Portugal > Lisbon > Lisbon (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)