AITopics

2405.01365

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

arXiv.org Machine LearningMay-2-2024

Misclassification bounds for PAC-Bayesian sparse deep learning

Mai, The Tien

Recently, there has been a significant focus on exploring the theoretical aspects of deep learning, especially regarding its performance in classification tasks. Bayesian deep learning has emerged as a unified probabilistic framework, seeking to integrate deep learning with Bayesian methodologies seamlessly. However, there exists a gap in the theoretical understanding of Bayesian approaches in deep learning for classification. This study presents an attempt to bridge that gap. By leveraging PAC-Bayes bounds techniques, we present theoretical results on the prediction or misclassification error of a probabilistic approach utilizing Spike-and-Slab priors for sparse deep learning in classification. We establish non-asymptotic results for the prediction error. Additionally, we demonstrate that, by considering different architectures, our results can achieve minimax optimal rates in both low and high-dimensional settings, up to a logarithmic factor. Moreover, our additional logarithmic term yields slight improvements over previous works. Additionally, we propose and analyze an automated model selection approach aimed at optimally choosing a network architecture with guaranteed optimality.

deep learning, learning, neural network, (17 more...)

2405.01304

Country:

North America > United States > New York (0.04)
Europe > France (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Ziou, Djemel, Fakir, Issam

Deriving Lehmer and H\"older means as maximum weighted likelihood estimates for the multivariate exponential family

Consider numerical observations; it is common to calculate their mean and refer to it as central tendency. There are, however, different measures of mean [4]. These measurements are sometimes grouped into families, like Lehmer and Hölder. Distinguishing these measures and better understanding their use involves identifying the link between them and probability density functions (PDFs). For example, the arithmetic mean is the maximum likelihood estimator (MLE) of the position parameter for the normal PDF and the scale parameter for the exponential PDF. For the families of Lehmer and Hölder means, such an interpretation has only recently been proposed for the case of PDFs in the case of the univariate exponential family Let's consider digital observations; it is often common to calculate their mean and designate it as a central tendency. However, there are various measures of the average [2]. These measures are sometimes grouped into families, such as Lehmer and Hölder.

central tendency, exponential family, multivariate exponential family, (12 more...)

2405.00964

Country:

North America > United States (0.47)
North America > Canada > Quebec > Estrie Region > Sherbrooke (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report (0.50)

Industry:

Government (0.48)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Hussien, Mohamed Manzour, Melo, Angie Nataly, Ballardini, Augusto Luis, Maldonado, Carlota Salinas, Izquierdo, Rubén, Sotelo, Miguel Ángel

RAG-based Explainable Prediction of Road Users Behaviors for Automated Driving using Knowledge Graphs and Large Language Models

Prediction of road users' behaviors in the context of autonomous driving has gained considerable attention by the scientific community in the last years. Most works focus on predicting behaviors based on kinematic information alone, a simplification of the reality since road users are humans, and as such they are highly influenced by their surrounding context. In addition, a large plethora of research works rely on powerful Deep Learning techniques, which exhibit high performance metrics in prediction tasks but may lack the ability to fully understand and exploit the contextual semantic information contained in the road scene, not to mention their inability to provide explainable predictions that can be understood by humans. In this work, we propose an explainable road users' behavior prediction system that integrates the reasoning abilities of Knowledge Graphs (KG) and the expressiveness capabilities of Large Language Models (LLM) by using Retrieval Augmented Generation (RAG) techniques. For that purpose, Knowledge Graph Embeddings (KGE) and Bayesian inference are combined to allow the deployment of a fully inductive reasoning system that enables the issuing of predictions that rely on legacy information contained in the graph as well as on current evidence gathered in real time by onboard sensors. Two use cases have been implemented following the proposed approach: 1) Prediction of pedestrians' crossing actions; 2) Prediction of lane change maneuvers. In both cases, the performance attained surpasses the current state of the art in terms of anticipation and F1-score, showing a promising avenue for future research in this field.

dataset, prediction, vehicle, (14 more...)

2405.00449

Country:

Europe > Spain > Galicia > Madrid (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
Europe > Germany (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Grand, Gabriel, Pepe, Valerio, Andreas, Jacob, Tenenbaum, Joshua B.

Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling

Questions combine our mastery of language with our remarkable facility for reasoning about uncertainty. How do people navigate vast hypothesis spaces to pose informative questions given limited cognitive resources? We study these tradeoffs in a classic grounded question-asking task based on the board game Battleship. Our language-informed program sampling (LIPS) model uses large language models (LLMs) to generate natural language questions, translate them into symbolic programs, and evaluate their expected information gain. We find that with a surprisingly modest resource budget, this simple Monte Carlo optimization strategy yields informative questions that mirror human performance across varied Battleship board scenarios. In contrast, LLM-only baselines struggle to ground questions in the board state; notably, GPT-4V provides no improvement over non-visual baselines. Our results illustrate how Bayesian models of question-asking can leverage the statistics of language to capture human priors, while highlighting some shortcomings of pure LLMs as grounded reasoners.

red ship, ship, topleft, (16 more...)

2402.19471

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Government > Military > Navy (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Discovering robust biomarkers of neurological disorders from functional MRI using graph neural networks: A Review

Chan, Yi Hao, Girish, Deepank, Gupta, Sukrit, Xia, Jing, Kasi, Chockalingam, He, Yinan, Wang, Conghao, Rajapakse, Jagath C.

Graph neural networks (GNN) have emerged as a popular tool for modelling functional magnetic resonance imaging (fMRI) datasets. Many recent studies have reported significant improvements in disorder classification performance via more sophisticated GNN designs and highlighted salient features that could be potential biomarkers of the disorder. In this review, we provide an overview of how GNN and model explainability techniques have been applied on fMRI datasets for disorder prediction tasks, with a particular emphasis on the robustness of biomarkers produced for neurodegenerative diseases and neuropsychiatric disorders. We found that while most studies have performant models, salient features highlighted in these studies vary greatly across studies on the same disorder and little has been done to evaluate their robustness. To address these issues, we suggest establishing new standards that are based on objective evaluation metrics to determine the robustness of these potential biomarkers. We further highlight gaps in the existing literature and put together a prediction-attribution-evaluation framework that could set the foundations for future research on improving the robustness of potential biomarkers discovered via GNNs.

accuracy, dataset, graph, (17 more...)

2405.00577

Country:

Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

arXiv.org Machine LearningMay-1-2024

Variational Bayesian Methods for a Tree-Structured Stick-Breaking Process Mixture of Gaussians

Nakahara, Yuta

The Bayes coding algorithm for context tree source is a successful example of Bayesian tree estimation in text compression in information theory. This algorithm provides an efficient parametric representation of the posterior tree distribution and exact updating of its parameters. We apply this algorithm to a clustering task in machine learning. More specifically, we apply it to Bayesian estimation of the tree-structured stick-breaking process (TS-SBP) mixture models. For TS-SBP mixture models, only Markov chain Monte Carlo methods have been proposed so far, but any variational Bayesian methods have not been proposed yet. In this paper, we propose a variational Bayesian method that has a subroutine similar to the Bayes coding algorithm for context tree sources. We confirm its behavior by a numerical experiment on a toy example.

denote, node, probability distribution, (16 more...)

2405.00385

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

Landry, Nicholas W., Thompson, William, Hébert-Dufresne, Laurent, Young, Jean-Gabriel

Complex contagions can outperform simple contagions for network reconstruction with dense networks or saturated dynamics

arXiv.org Machine LearningApr-30-2024

Network scientists often use complex dynamic processes to describe network contagions, but tools for fitting contagion models typically assume simple dynamics. Here, we address this gap by developing a nonparametric method to reconstruct a network and dynamics from a series of node states, using a model that breaks the dichotomy between simple pairwise and complex neighborhood-based contagions. We then show that a network is more easily reconstructed when observed through the lens of complex contagions if it is dense or the dynamic saturates, and that simple contagions are better otherwise.

complex contagion, contagion, network reconstruction, (13 more...)

2405.00129

Country: North America > United States > Vermont > Chittenden County > Burlington (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceApr-30-2024

A Unified Theory of Exact Inference and Learning in Exponential Family Latent Variable Models

Sokoloski, Sacha

Bayes' rule describes how to infer posterior beliefs about latent variables given observations, and inference is a critical step in learning algorithms for latent variable models (LVMs). Although there are exact algorithms for inference and learning for certain LVMs such as linear Gaussian models and mixture models, researchers must typically develop approximate inference and learning algorithms when applying novel LVMs. In this paper we study the line that separates LVMs that rely on approximation schemes from those that do not, and develop a general theory of exponential family, latent variable models for which inference and learning may be implemented exactly. Firstly, under mild assumptions about the exponential family form of a given LVM, we derive necessary and sufficient conditions under which the LVM prior is in the same exponential family as its posterior, such that the prior is conjugate to the posterior. We show that all models that satisfy these conditions are constrained forms of a particular class of exponential family graphical model. We then derive general inference and learning algorithms, and demonstrate them on a variety of example models. Finally, we show how to compose our models into graphical models that retain tractable inference and learning. In addition to our theoretical work, we have implemented our algorithms in a collection of libraries with which we provide numerous demonstrations of our theory, and with which researchers may apply our theory in novel statistical settings.

algorithm, exponential family, harmonium, (14 more...)

2404.19501

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

arXiv.org Artificial IntelligenceApr-30-2024

The Role of $n$-gram Smoothing in the Age of Neural Networks

Malagutti, Luca, Buinovskij, Andrius, Svete, Anej, Meister, Clara, Amini, Afra, Cotterell, Ryan

For nearly three decades, language models derived from the $n$-gram assumption held the state of the art on the task. The key to their success lay in the application of various smoothing techniques that served to combat overfitting. However, when neural language models toppled $n$-gram models as the best performers, $n$-gram smoothing techniques became less relevant. Indeed, it would hardly be an understatement to suggest that the line of inquiry into $n$-gram smoothing techniques became dormant. This paper re-opens the role classical $n$-gram smoothing techniques may play in the age of neural language models. First, we draw a formal equivalence between label smoothing, a popular regularization technique for neural language models, and add-$\lambda$ smoothing. Second, we derive a generalized framework for converting any $n$-gram smoothing technique into a regularizer compatible with neural language models. Our empirical results find that our novel regularizers are comparable to and, indeed, sometimes outperform label smoothing on language modeling and machine translation.

computational linguistic, language model, probability, (14 more...)

2403.1724

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > France (0.04)
(13 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)