AITopics

2405.02366

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Mexico > Puebla > San Andrés Cholula (0.04)
North America > Mexico > Puebla > Puebla (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

arXiv.org Machine LearningMay-3-2024

A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning

Dewolf, Nicolas

In the past decades, most work in the area of data analysis and machine learning was focused on optimizing predictive models and getting better results than what was possible with existing models. To what extent the metrics with which such improvements were measured were accurately capturing the intended goal, whether the numerical differences in the resulting values were significant, or whether uncertainty played a role in this study and if it should have been taken into account, was of secondary importance. Whereas probability theory, be it frequentist or Bayesian, used to be the gold standard in science before the advent of the supercomputer, it was quickly replaced in favor of black box models and sheer computing power because of their ability to handle large data sets. This evolution sadly happened at the expense of interpretability and trustworthiness. However, while people are still trying to improve the predictive power of their models, the community is starting to realize that for many applications it is not so much the exact prediction that is of importance, but rather the variability or uncertainty. The work in this dissertation tries to further the quest for a world where everyone is aware of uncertainty, of how important it is and how to embrace it instead of fearing it. A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed. Certain aspects and applications of the framework -- dubbed `conformal prediction' -- are studied in detail. Whereas many approaches to uncertainty quantification make strong assumptions about the data, conformal prediction is, at the time of writing, the only framework that deserves the title `distribution-free'. No parametric assumptions have to be made and the nonparametric results also hold without having to resort to the law of large numbers in the asymptotic regime.

clusterwise average prediction, clusterwise representation complexity, conditional nonconformity distribution, (16 more...)

2405.02082

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
Asia > Middle East > Jordan (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Transportation (1.00)
Health & Medicine (1.00)
Education > Educational Setting (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(6 more...)

Lillelund, Christian Marius, Pannullo, Fernando, Jakobsen, Morten Opprud, Morante, Manuel, Pedersen, Christian Fischer

A probabilistic estimation of remaining useful life from censored time-to-event data

arXiv.org Artificial IntelligenceMay-2-2024

Predicting the remaining useful life (RUL) of ball bearings plays an important role in predictive maintenance. A common definition of the RUL is the time until a bearing is no longer functional, which we denote as an event, and many data-driven methods have been proposed to predict the RUL. However, few studies have addressed the problem of censored data, where this event of interest is not observed, and simply ignoring these observations can lead to an overestimation of the failure risk. In this paper, we propose a probabilistic estimation of RUL using survival analysis that supports censored data. First, we analyze sensor readings from ball bearings in the frequency domain and annotate when a bearing starts to deteriorate by calculating the Kullback-Leibler (KL) divergence between the probability density function (PDF) of the current process and a reference PDF. Second, we train several survival models on the annotated bearing dataset, capable of predicting the RUL over a finite time horizon using the survival function. This function is guaranteed to be strictly monotonically decreasing and is an intuitive estimation of the remaining lifetime. We demonstrate our approach in the XJTU-SY dataset using cross-validation and find that Random Survival Forests consistently outperforms both non-neural networks and neural networks in terms of the mean absolute error (MAE). Our work encourages the inclusion of censored data in predictive maintenance models and highlights the unique advantages that survival analysis offers when it comes to probabilistic RUL estimation and early fault detection.

bearing, dataset, prediction, (17 more...)

2405.01614

Country:

Europe > Denmark (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine (1.00)
Energy (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Waxman, Daniel, Djurić, Petar M.

Dynamic Online Ensembles of Basis Expansions

arXiv.org Machine LearningMay-2-2024

Practical Bayesian learning often requires (1) online inference, (2) dynamic models, and (3) ensembling over multiple different models. Recent advances have shown how to use random feature approximations to achieve scalable, online ensembling of Gaussian processes with desirable theoretical properties and fruitful applications. One key to these methods' success is the inclusion of a random walk on the model parameters, which makes models dynamic. We show that these methods can be generalized easily to any basis expansion model and that using alternative basis expansions, such as Hilbert space Gaussian processes, often results in better performance. To simplify the process of choosing a specific basis expansion, our method's generality also allows the ensembling of several entirely different models, for example, a Gaussian process and polynomial regression. Finally, we propose a novel method to ensemble static and dynamic models together.

dataset, machine learning research, transaction, (14 more...)

2405.01365

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Ziou, Djemel, Fakir, Issam

Deriving Lehmer and H\"older means as maximum weighted likelihood estimates for the multivariate exponential family

Consider numerical observations; it is common to calculate their mean and refer to it as central tendency. There are, however, different measures of mean [4]. These measurements are sometimes grouped into families, like Lehmer and Hölder. Distinguishing these measures and better understanding their use involves identifying the link between them and probability density functions (PDFs). For example, the arithmetic mean is the maximum likelihood estimator (MLE) of the position parameter for the normal PDF and the scale parameter for the exponential PDF. For the families of Lehmer and Hölder means, such an interpretation has only recently been proposed for the case of PDFs in the case of the univariate exponential family Let's consider digital observations; it is often common to calculate their mean and designate it as a central tendency. However, there are various measures of the average [2]. These measures are sometimes grouped into families, such as Lehmer and Hölder.

central tendency, exponential family, multivariate exponential family, (12 more...)

2405.00964

Country:

North America > United States (0.47)
North America > Canada > Quebec > Estrie Region > Sherbrooke (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report (0.50)

Industry:

Government (0.48)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Austin, David Eric, Korikov, Anton, Toroghi, Armin, Sanner, Scott

Bayesian Optimization with LLM-Based Acquisition Functions for Natural Language Preference Elicitation

Designing preference elicitation (PE) methodologies that can quickly ascertain a user's top item preferences in a cold-start setting is a key challenge for building effective and personalized conversational recommendation (ConvRec) systems. While large language models (LLMs) constitute a novel technology that enables fully natural language (NL) PE dialogues, we hypothesize that monolithic LLM NL-PE approaches lack the multi-turn, decision-theoretic reasoning required to effectively balance the NL exploration and exploitation of user preferences towards an arbitrary item set. In contrast, traditional Bayesian optimization PE methods define theoretically optimal PE strategies, but fail to use NL item descriptions or generate NL queries, unrealistically assuming users can express preferences with direct item ratings and comparisons. To overcome the limitations of both approaches, we formulate NL-PE in a Bayesian Optimization (BO) framework that seeks to generate NL queries which actively elicit natural language feedback to reduce uncertainty over item utilities to identify the best recommendation. We demonstrate our framework in a novel NL-PE algorithm, PEBOL, which uses Natural Language Inference (NLI) between user preference utterances and NL item descriptions to maintain preference beliefs and BO strategies such as Thompson Sampling (TS) and Upper Confidence Bound (UCB) to guide LLM query generation. We numerically evaluate our methods in controlled experiments, finding that PEBOL achieves up to 131% improvement in MAP@10 after 10 turns of cold start NL-PE dialogue compared to monolithic GPT-3.5, despite relying on a much smaller 400M parameter NLI model for preference inference.

bayesian optimization, item description, query, (12 more...)

2405.00981

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York > New York County > New York City (0.05)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
(5 more...)

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Grand, Gabriel, Pepe, Valerio, Andreas, Jacob, Tenenbaum, Joshua B.

Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling

Questions combine our mastery of language with our remarkable facility for reasoning about uncertainty. How do people navigate vast hypothesis spaces to pose informative questions given limited cognitive resources? We study these tradeoffs in a classic grounded question-asking task based on the board game Battleship. Our language-informed program sampling (LIPS) model uses large language models (LLMs) to generate natural language questions, translate them into symbolic programs, and evaluate their expected information gain. We find that with a surprisingly modest resource budget, this simple Monte Carlo optimization strategy yields informative questions that mirror human performance across varied Battleship board scenarios. In contrast, LLM-only baselines struggle to ground questions in the board state; notably, GPT-4V provides no improvement over non-visual baselines. Our results illustrate how Bayesian models of question-asking can leverage the statistics of language to capture human priors, while highlighting some shortcomings of pure LLMs as grounded reasoners.

red ship, ship, topleft, (16 more...)

2402.19471

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Government > Military > Navy (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Zhuang, Jun, Cunningham, Jack, Guan, Chaowen

Improving Trainability of Variational Quantum Circuits via Regularization Strategies

In the era of noisy intermediate-scale quantum (NISQ), variational quantum circuits (VQCs) have been widely applied in various domains, advancing the superiority of quantum circuits against classic models. Similar to classic models, regular VQCs can be optimized by various gradient-based methods. However, the optimization may be initially trapped in barren plateaus or eventually entangled in saddle points during training. These gradient issues can significantly undermine the trainability of VQC. In this work, we propose a strategy that regularizes model parameters with prior knowledge of the train data and Gaussian noise diffusion. We conduct ablation studies to verify the effectiveness of our strategy across four public datasets and demonstrate that our method can improve the trainability of VQCs against the above-mentioned gradient issues.

dr max, glorot-normal, model parameter, (16 more...)

2405.01606

Country:

North America > United States > Ohio > Hamilton County > Cincinnati (0.04)
North America > United States > Idaho > Ada County > Boise (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Machine LearningMay-1-2024

Variational Bayesian Methods for a Tree-Structured Stick-Breaking Process Mixture of Gaussians

Nakahara, Yuta

The Bayes coding algorithm for context tree source is a successful example of Bayesian tree estimation in text compression in information theory. This algorithm provides an efficient parametric representation of the posterior tree distribution and exact updating of its parameters. We apply this algorithm to a clustering task in machine learning. More specifically, we apply it to Bayesian estimation of the tree-structured stick-breaking process (TS-SBP) mixture models. For TS-SBP mixture models, only Markov chain Monte Carlo methods have been proposed so far, but any variational Bayesian methods have not been proposed yet. In this paper, we propose a variational Bayesian method that has a subroutine similar to the Bayes coding algorithm for context tree sources. We confirm its behavior by a numerical experiment on a toy example.

denote, node, probability distribution, (16 more...)

2405.00385

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Landry, Nicholas W., Thompson, William, Hébert-Dufresne, Laurent, Young, Jean-Gabriel

Complex contagions can outperform simple contagions for network reconstruction with dense networks or saturated dynamics

arXiv.org Machine LearningApr-30-2024

Network scientists often use complex dynamic processes to describe network contagions, but tools for fitting contagion models typically assume simple dynamics. Here, we address this gap by developing a nonparametric method to reconstruct a network and dynamics from a series of node states, using a model that breaks the dichotomy between simple pairwise and complex neighborhood-based contagions. We then show that a network is more easily reconstructed when observed through the lens of complex contagions if it is dense or the dynamic saturates, and that simple contagions are better otherwise.

complex contagion, contagion, network reconstruction, (13 more...)

2405.00129

Country: North America > United States > Vermont > Chittenden County > Burlington (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)