AITopics

2309.02609

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)
(2 more...)

arXiv.org Artificial IntelligenceDec-29-2023

Distributional Offline Policy Evaluation with Predictive Error Guarantees

Wu, Runzhe, Uehara, Masatoshi, Sun, Wen

We study the problem of estimating the distribution of the return of a policy using an offline dataset that is not generated from the policy, i.e., distributional offline policy evaluation (OPE). We propose an algorithm called Fitted Likelihood Estimation (FLE), which conducts a sequence of Maximum Likelihood Estimation (MLE) and has the flexibility of integrating any state-of-the-art probabilistic generative models as long as it can be trained via MLE. FLE can be used for both finite-horizon and infinite-horizon discounted settings where rewards can be multi-dimensional vectors. Our theoretical results show that for both finite-horizon and infinite-horizon discounted settings, FLE can learn distributions that are close to the ground truth under total variation distance and Wasserstein distance, respectively. Our theoretical results hold under the conditions that the offline data covers the test policy's traces and that the supervised learning MLE procedures succeed. Experimentally, we demonstrate the performance of FLE with two generative models, Gaussian mixture models and diffusion models. For the multi-dimensional reward setting, FLE with diffusion models is capable of estimating the complicated distribution of the return of a test policy.

algorithm, assumption 4, distributional offline policy evaluation, (12 more...)

2302.09456

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Senetaire, Hugo Henri Joseph, Garreau, Damien, Frellsen, Jes, Mattei, Pierre-Alexandre

Explainability as statistical inference

arXiv.org Artificial IntelligenceDec-29-2023

A wide variety of model explanation approaches have been proposed in recent years, all guided by very different rationales and heuristics. In this paper, we take a new route and cast interpretability as a statistical inference problem. We propose a general deep probabilistic model designed to produce interpretable predictions. The model parameters can be learned via maximum likelihood, and the method can be adapted to any predictor network architecture and any type of prediction problem. Our method is a case of amortized interpretability models, where a neural network is used as a selector to allow for fast interpretation at inference time. Several popular interpretability methods are shown to be particular cases of regularised maximum likelihood for our general model. We propose new datasets with ground truth selection which allow for the evaluation of the features importance map. Using these datasets, we show experimentally that using multiple imputation provides more reasonable interpretations.

dataset, imputation, selection, (17 more...)

2212.03131

Country:

Europe > Denmark (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Alpes-Maritimes > Nice (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

arXiv.org Machine LearningDec-28-2023

Inconsistency of cross-validation for structure learning in Gaussian graphical models

Lyu, Zhao, Tai, Wai Ming, Kolar, Mladen, Aragam, Bryon

Despite numerous years of research into the merits and trade-offs of various model selection criteria, obtaining robust results that elucidate the behavior of cross-validation remains a challenging endeavor. In this paper, we highlight the inherent limitations of cross-validation when employed to discern the structure of a Gaussian graphical model. We provide finite-sample bounds on the probability that the Lasso estimator for the neighborhood of a node within a Gaussian graphical model, optimized using a prediction oracle, misidentifies the neighborhood. Our results pertain to both undirected and directed acyclic graphs, encompassing general, sparse covariance structures. To support our theoretical findings, we conduct an empirical investigation of this inconsistency by contrasting our outcomes with other commonly used information criteria through an extensive simulation study. Given that many algorithms designed to learn the structure of graphical models require hyperparameter selection, the precise calibration of this hyperparameter is paramount for accurately estimating the inherent structure. Consequently, our observations shed light on this widely recognized practical challenge.

artificial intelligence, lasso solution, machine learning, (20 more...)

2312.17047

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

arXiv.org Machine LearningDec-28-2023

Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift

Eyre, Benjamin, Creager, Elliot, Madras, David, Papyan, Vardan, Zemel, Richard

Designing deep neural network classifiers that perform robustly on distributions differing from the available training data is an active area of machine learning research. However, out-of-distribution generalization for regression--the analogous problem for modeling continuous targets--remains relatively unexplored. To tackle this problem, we return to first principles and analyze how the closed-form solution for Ordinary Least Squares (OLS) regression is sensitive to covariate shift. We characterize the out-of-distribution risk of the OLS model in terms of the eigenspectrum decomposition of the source and target data. We then use this insight to propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution. We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.

egression, pectrally, spar, (15 more...)

2312.17463

Country:

North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > France (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

arXiv.org Machine LearningDec-28-2023

Tractable Function-Space Variational Inference in Bayesian Neural Networks

Rudner, Tim G. J., Chen, Zonghao, Teh, Yee Whye, Gal, Yarin

Reliable predictive uncertainty estimation plays an important role in enabling the deployment of neural networks to safety-critical settings. A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters, infer an approximate posterior distribution, and use it to make stochastic predictions. However, explicit inference over neural network parameters makes it difficult to incorporate meaningful prior information about the data-generating process into the model. In this paper, we pursue an alternative approach. Recognizing that the primary object of interest in most settings is the distribution over functions induced by the posterior distribution over neural network parameters, we frame Bayesian inference in neural networks explicitly as inferring a posterior distribution over functions and propose a scalable function-space variational inference method that allows incorporating prior information and results in reliable predictive uncertainty estimates. We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks and demonstrate that it performs well on a challenging safety-critical medical diagnosis task in which reliable uncertainty estimation is essential.

dataset, inference, neural network, (15 more...)

2312.17199

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Javaheri, Amirhossein, Amini, Arash, Marvasti, Farokh, Palomar, Daniel P.

Joint Signal Recovery and Graph Learning from Incomplete Time-Series

arXiv.org Artificial IntelligenceDec-28-2023

Learning a graph from data is the key to taking advantage of graph signal processing tools. Most of the conventional algorithms for graph learning require complete data statistics, which might not be available in some scenarios. In this work, we aim to learn a graph from incomplete time-series observations. From another viewpoint, we consider the problem of semi-blind recovery of time-varying graph signals where the underlying graph model is unknown. We propose an algorithm based on the method of block successive upperbound minimization (BSUM), for simultaneous inference of the signal and the graph from incomplete data. Simulation results on synthetic and real time-series demonstrate the performance of the proposed method for graph learning and signal recovery.

algorithm, graph, matrix, (12 more...)

2312.1694

Country:

North America > United States > California (0.05)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial IntelligenceDec-28-2023

MACCA: Offline Multi-agent Reinforcement Learning with Causal Credit Assignment

Wang, Ziyan, Du, Yali, Zhang, Yudi, Fang, Meng, Huang, Biwei

Offline Multi-agent Reinforcement Learning (MARL) is valuable in scenarios where online interaction is impractical or risky. While independent learning in MARL offers flexibility and scalability, accurately assigning credit to individual agents in offline settings poses challenges because interactions with an environment are prohibited. In this paper, we propose a new framework, namely Multi-Agent Causal Credit Assignment (MACCA), to address credit assignment in the offline MARL setting. Our approach, MACCA, characterizing the generative process as a Dynamic Bayesian Network, captures relationships between environmental variables, states, actions, and rewards. Estimating this model on offline data, MACCA can learn each agent's contribution by analyzing the causal relationship of their individual rewards, ensuring accurate and interpretable credit assignment. Additionally, the modularity of our approach allows it to seamlessly integrate with various offline MARL methods. Theoretically, we proved that under the setting of the offline dataset, the underlying causal structure and the function for generating the individual rewards of agents are identifiable, which laid the foundation for the correctness of our modeling. In our experiments, we demonstrate that MACCA not only outperforms state-of-the-art methods but also enhances performance when integrated with other backbones.

individual reward, offline multi-agent reinforcement learning, team reward, (8 more...)

2312.03644

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Rasul, Rownak Ara, Saha, Promy, Bala, Diponkor, Karim, S M Rakib Ul, Abdullah, Md. Ibrahim, Saha, Bishwajit

An Evaluation of Machine Learning Approaches for Early Diagnosis of Autism Spectrum Disorder

arXiv.org Artificial IntelligenceDec-28-2023

Autistic Spectrum Disorder (ASD) is a neurological disease characterized by difficulties with social interaction, communication, and repetitive activities. While its primary origin lies in genetics, early detection is crucial, and leveraging machine learning offers a promising avenue for a faster and more cost-effective diagnosis. This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process. We study eight state-of-the-art classification models to determine their effectiveness in ASD detection. We evaluate the models using accuracy, precision, recall, specificity, F1-score, area under the curve (AUC), kappa, and log loss metrics to find the best classifier for these binary datasets. Among all the classification models, for the children dataset, the SVM and LR models achieve the highest accuracy of 100% and for the adult dataset, the LR model produces the highest accuracy of 97.14%. Our proposed ANN model provides the highest accuracy of 94.24% for the new combined dataset when hyperparameters are precisely tuned for each model. As almost all classification models achieve high accuracy which utilize true labels, we become interested in delving into five popular clustering algorithms to understand model behavior in scenarios without true labels. We calculate Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and Silhouette Coefficient (SC) metrics to select the best clustering models. Our evaluation finds that spectral clustering outperforms all other benchmarking clustering models in terms of NMI and ARI metrics while demonstrating comparability to the optimal SC achieved by k-means. The implemented code is available at GitHub.

accuracy, algorithm, dataset, (13 more...)

2309.11646

Country:

North America > United States > Missouri > Boone County > Columbia (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningDec-27-2023

Linear Complexity Gibbs Sampling for Generalized Labeled Multi-Bernoulli Filtering

Shim, Changbeom, Vo, Ba-Tuong, Vo, Ba-Ngu, Ong, Jonah, Moratuwage, Diluka

Generalized Labeled Multi-Bernoulli (GLMB) densities arise in a host of multi-object system applications analogous to Gaussians in single-object filtering. However, computing the GLMB filtering density requires solving NP-hard problems. To alleviate this computational bottleneck, we develop a linear complexity Gibbs sampling framework for GLMB density computation. Specifically, we propose a tempered Gibbs sampler that exploits the structure of the GLMB filtering density to achieve an $\mathcal{O}(T(P+M))$ complexity, where $T$ is the number of iterations of the algorithm, $P$ and $M$ are the number hypothesized objects and measurements. This innovation enables the GLMB filter implementation to be reduced from an $\mathcal{O}(TP^{2}M)$ complexity to $\mathcal{O}(T(P+M+\log T)+PM)$. Moreover, the proposed framework provides the flexibility for trade-offs between tracking performance and computational load. Convergence of the proposed Gibbs sampler is established, and numerical studies are presented to validate the proposed GLMB filter implementation.

complexity, implementation, signal process, (14 more...)

doi: 10.1109/TSP.2023.3277220

2211.16041

Country:

North America > United States > Massachusetts > Norfolk County > Norwood (0.04)
Oceania > Australia (0.04)
North America > United States > Connecticut > Tolland County > Storrs (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)