AITopics

2005.05862

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > India > NCT > New Delhi (0.04)
Asia > India > NCT > Delhi (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology (0.88)
Energy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Shpak, Maryia, Miasojedow, Błażej, Rejchel, Wojciech

Structure learning for CTBN's via penalized maximum likelihood methods

Continuous-time Bayesian networks (CTBNs) are a class of stochastic processes that can be used to model complex phenomena; for instance, they can describe interactions occurring in living processes, in social science models, or in medicine. The literature on this topic usually focuses on the case in which the dependence structure of a system is known and the task is to determine conditional transition intensities (the parameters of the network). In this paper, we study the structure learning problem, which is a more challenging task and on which existing research is limited. The approach we propose is based on a penalized likelihood method. We prove that, under mild regularity conditions, our algorithm recognizes the dependence structure of the graph with high probability. We also investigate the properties of the procedure in numerical studies to demonstrate its effectiveness.

ctbn, graph, structure learning, (14 more...)

2006.07648

Country:

North America > United States > New York (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Poland > Lublin Province > Lublin (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Spencer, Neil A., Junker, Brian, Sweet, Tracy M.

Faster MCMC for Gaussian Latent Position Network Models

Latent position network models are a versatile tool in network science; applications include clustering entities, controlling for causal confounders, and defining priors over unobserved graphs. Estimating each node's latent position is typically framed as a Bayesian inference problem, with Metropolis within Gibbs being the most popular tool for approximating the posterior distribution. However, it is well known that Metropolis within Gibbs is inefficient for large networks; the acceptance ratios are expensive to compute, and the resultant posterior draws are highly correlated. In this article, we propose an alternative Markov chain Monte Carlo strategy---defined using a combination of split Hamiltonian Monte Carlo and Firefly Monte Carlo---that leverages the posterior distribution's functional form for more efficient posterior computation. We demonstrate that these strategies outperform Metropolis within Gibbs and other algorithms on synthetic networks, as well as on real information-sharing networks of teachers and staff in a school district.

artificial intelligence, bayesian inference, machine learning, (18 more...)

2006.07687

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Maryland (0.04)

Genre: Research Report > New Finding (0.45)

Industry:

Education (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Kohns, David, Szendrei, Tibor

Horseshoe Prior Bayesian Quantile Regression

This paper extends the horseshoe prior of Carvalho et al. (2010) to the Bayesian quantile regression (HS-BQR) and provides a fast sampling algorithm that speeds up computation significantly in high dimensions. The performance of the HS-BQR is tested on large-scale Monte Carlo simulations and an empirical application relevant to macroeconomics. The Monte Carlo design considers several sparsity structures (sparse, dense, block) and error structures (i.i.d. errors and heteroskedastic errors). A number of LASSO-based estimators (frequentist and Bayesian) are pitted against the HS-BQR to better gauge the performance of the method on the different designs. The HS-BQR performs as well as, or better than, the other estimators considered when evaluated using coefficient bias and forecast error. We find that the HS-BQR is particularly potent in sparse designs and when estimating extreme quantiles. The simulations also highlight how the high-dimensional quantile estimators fail to correctly identify the quantile function of the variables when both location and scale effects are present. In the empirical application, in which we evaluate forecast densities of US inflation, the HS-BQR provides well-calibrated forecast densities whose individual quantiles have the highest pseudo R squared, highlighting its potential for Value-at-Risk estimation.
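
As a rough illustration of the quantities this abstract refers to, the sketch below computes the quantile (check) loss that quantile regression minimizes, together with a pseudo R squared of the form 1 - V_model / V_baseline. The abstract does not define its pseudo R squared, so this ratio-of-check-losses form is an assumption, and all function names are hypothetical; nothing here reproduces the HS-BQR sampler itself.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(y, fitted, tau):
    """Summed check loss of fitted quantiles against observations."""
    return sum(check_loss(yi - fi, tau) for yi, fi in zip(y, fitted))


def pseudo_r_squared(y, fitted, baseline, tau):
    """Assumed form: 1 - V_model / V_baseline, with V the summed check loss."""
    return 1.0 - total_check_loss(y, fitted, tau) / total_check_loss(y, baseline, tau)


def best_constant(y, tau):
    """Constant fit minimizing the summed check loss, searched over the data points."""
    return min(y, key=lambda c: total_check_loss(y, [c] * len(y), tau))


if __name__ == "__main__":
    y = list(range(1, 11))          # toy observations 1..10
    tau = 0.75
    c = best_constant(y, tau)       # an empirical 0.75-quantile of y
    baseline = [c] * len(y)         # intercept-only quantile fit
    print(c, pseudo_r_squared(y, y, baseline, tau))
```

The intercept-only baseline plays the role that the sample mean plays in the ordinary R squared: a model that predicts the conditional quantile no better than a constant scores zero.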

artificial intelligence, machine learning, modeling & simulation, (19 more...)

2006.07655

Country:

Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Banking & Finance > Economy (0.93)

Technology:

Information Technology > Modeling & Simulation (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Lu, Zhiyun, Ie, Eugene, Sha, Fei

Uncertainty Estimation with Infinitesimal Jackknife, Its Distribution and Mean-Field Approximation

Uncertainty quantification is an important research area in machine learning. Many approaches have been developed to improve the representation of uncertainty in deep models to avoid overconfident predictions. Existing approaches, such as Bayesian neural networks and ensemble methods, require modifications to the training procedures and are computationally costly for both training and inference. Motivated by this, we propose mean-field infinitesimal jackknife (mfIJ) -- a simple, efficient, and general-purpose plug-in estimator for uncertainty estimation. The main idea is to use infinitesimal jackknife, a classical tool from statistics for uncertainty estimation, to construct a pseudo-ensemble that can be described with a closed-form Gaussian distribution, without retraining. We then use this Gaussian distribution for uncertainty estimation. While the standard way is to sample models from this distribution and combine each sample's prediction, we develop a mean-field approximation to the inference where Gaussian random variables need to be integrated with the softmax nonlinear functions to generate probabilities for multinomial variables. The approach has many appealing properties: it functions as an ensemble without requiring multiple models, and it enables closed-form approximate inference using only the first and second moments of Gaussians. Empirically, mfIJ performs competitively when compared to state-of-the-art methods, including deep ensembles, temperature scaling, dropout and Bayesian NNs, on important uncertainty tasks. It especially outperforms many methods on out-of-distribution detection.
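
To make the contrast between sampling and closed-form inference concrete, the sketch below treats the binary case only: a Gaussian logit pushed through a sigmoid. It compares a Monte Carlo average with the classical probit-style approximation sigmoid(mu / sqrt(1 + pi * sigma^2 / 8)), which needs only the first two moments. This is a stand-in chosen for illustration, not the softmax mean-field formula of the paper, and the function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sampled_probability(mu, sigma, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[sigmoid(z)] for z ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    draws = (sigmoid(rng.gauss(mu, sigma)) for _ in range(n_samples))
    return sum(draws) / n_samples


def mean_field_probability(mu, sigma):
    """Closed-form approximation using only the mean and variance of the logit."""
    return sigmoid(mu / math.sqrt(1.0 + math.pi * sigma ** 2 / 8.0))


if __name__ == "__main__":
    mu, sigma = 1.0, 2.0
    print("sampled:   ", sampled_probability(mu, sigma))
    print("mean-field:", mean_field_probability(mu, sigma))
    print("no uncertainty:", sigmoid(mu))
```

A larger logit variance pulls the approximate probability toward 0.5, which is the qualitative behavior one wants from an uncertainty-aware prediction.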

approximation, artificial intelligence, machine learning, (14 more...)

2006.07584

Country:

North America > United States > California (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
(2 more...)

Javid, Kamran, Handley, Will, Hobson, Mike, Lasenby, Anthony

Compromise-free Bayesian neural networks

We conduct a thorough analysis of the relationship between the out-of-sample performance and the Bayesian evidence (marginal likelihood) of Bayesian neural networks (BNNs), as well as looking at the performance of ensembles of BNNs, both using the Boston housing dataset. Using the state-of-the-art in nested sampling, we numerically sample the full (non-Gaussian and multimodal) network posterior and obtain numerical estimates of the Bayesian evidence, considering network models with up to 156 trainable parameters. The networks have between zero and four hidden layers, either $\tanh$ or $ReLU$ activation functions, and with and without hierarchical priors. The ensembles of BNNs are obtained by determining the posterior distribution over networks, from the posterior samples of individual BNNs re-weighted by the associated Bayesian evidence values. There is good correlation between out-of-sample performance and evidence, as well as a remarkable symmetry between the evidence versus model size and out-of-sample performance versus model size planes. Networks with $ReLU$ activation functions have consistently higher evidences than those with $\tanh$ functions, and this is reflected in their out-of-sample performance. Ensembling over architectures acts to further improve performance relative to the individual BNNs.
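
The evidence re-weighting described above can be sketched as follows, under two simplifying assumptions that are not taken from the paper: a uniform prior over the candidate networks, and one summary prediction per network instead of its full set of posterior samples. The function names are hypothetical.

```python
import math


def evidence_weights(log_evidences):
    """Posterior model weights proportional to exp(log Z), assuming a uniform model prior."""
    shift = max(log_evidences)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(lz - shift) for lz in log_evidences]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def ensemble_prediction(predictions, log_evidences):
    """Evidence-weighted average of one scalar prediction per network."""
    weights = evidence_weights(log_evidences)
    return sum(w * p for w, p in zip(weights, predictions))


if __name__ == "__main__":
    # Two hypothetical networks; the second has three times the evidence of the first.
    log_z = [0.0, math.log(3.0)]
    print(evidence_weights(log_z))
    print(ensemble_prediction([10.0, 20.0], log_z))
```

Working with log evidences and subtracting the maximum before exponentiating avoids underflow, since evidence values from nested sampling are usually reported on the log scale.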

artificial intelligence, bayesian inference, machine learning, (17 more...)

2004.12211

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Lakin, Steven Michael, Abdo, Zaid

Fast Maximum Likelihood Estimation and Supervised Classification for the Beta-Liouville Multinomial

arXiv.org Machine Learning Jun-12-2020

The multinomial and related distributions have long been used to model categorical, count-based data in fields ranging from bioinformatics to natural language processing. Commonly utilized variants include the standard multinomial and the Dirichlet multinomial distributions due to their computational efficiency and straightforward parameter estimation process. However, these distributions make strict assumptions about the mean, variance, and covariance between the categorical features being modeled. If these assumptions are not met by the data, it may result in poor parameter estimates and loss in accuracy for downstream applications like classification. Here, we explore efficient parameter estimation and supervised classification methods using an alternative distribution, called the Beta-Liouville multinomial, which relaxes some of the multinomial assumptions. We show that the Beta-Liouville multinomial is comparable in efficiency to the Dirichlet multinomial for Newton-Raphson maximum likelihood estimation, and that its performance on simulated data matches or exceeds that of the multinomial and Dirichlet multinomial distributions. Finally, we demonstrate that the Beta-Liouville multinomial outperforms the multinomial and Dirichlet multinomial on two out of four gold standard datasets, supporting its use in modeling data with low to medium class overlap in a supervised classification context.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2006.07454

Country:

North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
North America > United States > Colorado > Larimer County > Fort Collins (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Zhou, Jing, Claeskens, Gerda, Bradic, Jelena

Detangling robustness in high dimensions: composite versus model-averaged estimation

arXiv.org Machine Learning Jun-12-2020

Robust methods, though ubiquitous in practice, are yet to be fully understood in the context of regularized estimation and high dimensions. Even simple questions become challenging very quickly. For example, classical statistical theory identifies equivalence between model-averaged and composite quantile estimation. However, little to nothing is known about such equivalence between methods that encourage sparsity. This paper provides a toolbox to further study robustness in these settings and focuses on prediction. In particular, we study optimally weighted model-averaged as well as composite $l_1$-regularized estimation. Optimal weights are determined by minimizing the asymptotic mean squared error. This approach incorporates the effects of regularization, without the assumption of perfect selection, as is often used in practice. Such weights are then optimal for prediction quality. Through an extensive simulation study, we show that no single method systematically outperforms others. We find, however, that model-averaged and composite quantile estimators often outperform least-squares methods, even in the case of Gaussian model noise. A real-data application demonstrates the method's practical use through the reconstruction of compressed audio signals.

artificial intelligence, estimator, machine learning, (18 more...)

2006.07457

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
(2 more...)

Zhi-Xuan, Tan, Mann, Jordyn L., Silver, Tom, Tenenbaum, Joshua B., Mansinghka, Vikash K.

Online Bayesian Goal Inference for Boundedly-Rational Planning Agents

arXiv.org Artificial Intelligence Jun-12-2020

People routinely infer the goals of others by observing their actions over time. Remarkably, we can do so even when those actions lead to failure, enabling us to assist others when we detect that they might not achieve their goals. How might we endow machines with similar capabilities? Here we present an architecture capable of inferring an agent's goals online from both optimal and non-optimal sequences of actions. Our architecture models agents as boundedly-rational planners that interleave search with execution by replanning, thereby accounting for sub-optimal behavior. These models are specified as probabilistic programs, allowing us to represent and perform efficient Bayesian inference over an agent's goals and internal planning processes. To perform such inference, we develop Sequential Inverse Plan Search (SIPS), a sequential Monte Carlo algorithm that exploits the online replanning assumption of these models, limiting computation by incrementally extending inferred plans as new actions are observed. We present experiments showing that this modeling and inference architecture outperforms Bayesian inverse reinforcement learning baselines, accurately inferring goals from both optimal and non-optimal trajectories involving failure and back-tracking, while generalizing across domains with compositional structure and sparse rewards.
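
The core idea of updating a belief over goals online, one observed action at a time, can be sketched on a toy problem. The example below is not SIPS: it replaces the sequential Monte Carlo machinery and the replanning agent model with exact enumeration over two goals on a line, and it assumes a simple noisy-rational (Boltzmann) action model with a hypothetical rationality parameter `beta`. All names are illustrative.

```python
import math


def action_likelihood(position, action, goal, beta=2.0):
    """Probability of stepping -1 or +1 for a noisy-rational agent heading to `goal`."""
    def utility(a):
        return -abs(goal - (position + a))  # closer to the goal is better
    normalizer = sum(math.exp(beta * utility(a)) for a in (-1, 1))
    return math.exp(beta * utility(action)) / normalizer


def update_posterior(posterior, position, action, beta=2.0):
    """One Bayesian update of the goal posterior after observing a single action."""
    unnormalized = {g: p * action_likelihood(position, action, g, beta)
                    for g, p in posterior.items()}
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}


def infer_goal(goals, start, actions, beta=2.0):
    """Process actions in order, starting from a uniform prior over goals."""
    posterior = {g: 1.0 / len(goals) for g in goals}
    position = start
    for action in actions:
        posterior = update_posterior(posterior, position, action, beta)
        position += action
    return posterior


if __name__ == "__main__":
    # The agent backtracks once (-1) but mostly heads right.
    print(infer_goal(goals=[-5, 5], start=0, actions=[1, 1, -1, 1]))
```

Because the action model assigns nonzero probability to suboptimal steps, a single backtracking move lowers but does not eliminate the belief in the right-hand goal.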

artificial intelligence, machine learning, trajectory, (16 more...)

2006.07532

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (0.47)

Zugarová, Eliška, Guy, Tatiana V.

Similarity-based transfer learning of decision policies

arXiv.org Artificial Intelligence Jun-12-2020

The problem of learning a decision policy from past experience is considered. Using the Fully Probabilistic Design (FPD) formalism, we propose a new general approach for finding a stochastic policy from past data.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2006.08768

Country:

Europe > Czechia > Prague (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.64)

Industry: Energy (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)