AITopics

2006.11631

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Austria > Vienna (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Gupta, Kartik, Suggala, Arun Sai, Prasad, Adarsh, Netrapalli, Praneeth, Ravikumar, Pradeep

Learning Minimax Estimators via Online Learning

arXiv.org Machine LearningJun-19-2020

We consider the problem of designing minimax estimators for estimating the parameters of a probability distribution. Unlike classical approaches such as the MLE and minimum distance estimators, we consider an algorithmic approach for constructing such estimators. We view the problem of designing minimax estimators as finding a mixed strategy Nash equilibrium of a zero-sum game. By leveraging recent results in online learning with non-convex losses, we provide a general algorithm for finding a mixed-strategy Nash equilibrium of general non-convex non-concave zero-sum games. Our algorithm requires access to two subroutines: (a) one which outputs a Bayes estimator corresponding to a given prior probability distribution, and (b) one which computes the worst-case risk of any given estimator. Given access to these two subroutines, we show that our algorithm outputs both a minimax estimator and a least favorable prior. To demonstrate the power of this approach, we use it to construct provably minimax estimators for classical problems such as estimation in the finite Gaussian sequence model, and linear regression.

artificial intelligence, estimator, machine learning, (20 more...)

2006.1143

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
Asia > India (0.04)

Genre: Research Report (0.82)

Industry:

Education > Educational Setting > Online (0.60)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
(2 more...)

arXiv.org Machine LearningJun-19-2020

A Non-Iterative Quantile Change Detection Method in Mixture Model with Heavy-Tailed Components

Li, Yuantong, Ma, Qi, Ghosh, Sujit K.

Estimating parameters of mixture model has wide applications Determining the number of components in a finite mixture model ranging from classification problems to estimating of complex distributions. is crucial in many application areas such as financial data [16, 31, 35], Most of the current literature on estimating the parameters biomedical studies [17, 36] and low-frequency accident occurrence of the mixture densities are based on iterative Expectation Maximization prediction [27, 32]. Existing literature have witnessed numerous (EM) type algorithms which require the use of either computational methods, and in particular Markov Chain Monte taking expectations over the latent label variables or generating Carlo methods [7, 14, 33] and EM algorithms [20-22] have been samples from the conditional distribution of such latent labels using used with a lot of success. However, either these methods are computationally the Bayes rule. Moreover, when the number of components is demanding and/or these methods are developed under unknown, the problem becomes computationally more demanding the assumption of data being generated from mixtures of densities due to well-known label switching issues [28]. In this paper, we from the exponential family, in part because the family of exponential propose a robust and quick approach based on change-point methods distribution has a sufficient statistic of constant dimension (i.e., to determine the number of mixture components that works the dimension of the sufficient statistic remains fixed for any sample for almost any location-scale families even when the components size) and so the updates of the data augmentation type algorithm are heavy tailed (e.g., Cauchy). We present several numerical illustrations involve their smaller dimensional sufficient statistics [11, 12, 24].

artificial intelligence, machine learning, mixture model, (15 more...)

2006.11383

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.83)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Pleiss, Geoff, Jankowiak, Martin, Eriksson, David, Damle, Anil, Gardner, Jacob R.

Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization

arXiv.org Machine LearningJun-19-2020

Matrix square roots and their inverses arise frequently in machine learning, e.g., when sampling from high-dimensional Gaussians $\mathcal{N}(\mathbf 0, \mathbf K)$ or whitening a vector $\mathbf b$ against covariance matrix $\mathbf K$. While existing methods typically require $O(N^3)$ computation, we introduce a highly-efficient quadratic-time algorithm for computing $\mathbf K^{1/2} \mathbf b$, $\mathbf K^{-1/2} \mathbf b$, and their derivatives through matrix-vector multiplication (MVMs). Our method combines Krylov subspace methods with a rational approximation and typically achieves $4$ decimal places of accuracy with fewer than $100$ MVMs. Moreover, the backward pass requires little additional computation. We demonstrate our method's applicability on matrices as large as $50,\!000 \times 50,\!000$ - well beyond traditional methods - with little approximation error. Applying this increased scalability to variational Gaussian processes, Bayesian optimization, and Gibbs sampling results in more powerful models with higher accuracy.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2006.11267

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Pennsylvania (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.50)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

#artificialintelligenceJun-18-2020, 06:29:40 GMT

Maximum likelihood estimation for Machine Learning - Nucleusbox

In the Logistic Regression for Machine Learning using Python blog, I have introduced the basic idea of the logistic function. We have discussed the cost function. And in the iterative method, we focus on the Gradient descent optimization method. Now so in this section, we are going to introduce the Maximum Likelihood cost function. And we would like to maximize this cost function.

artificial intelligence, bayesian inference, machine learning, (12 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.41)
Research Report > Experimental Study (0.41)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Asaadi, Erfan, Denney, Ewen, Pai, Ganesh

Quantifying Assurance in Learning-enabled Systems

arXiv.org Artificial IntelligenceJun-18-2020

Dependability assurance of systems embedding machine learning(ML) components---so called learning-enabled systems (LESs)---is a key step for their use in safety-critical applications. In emerging standardization and guidance efforts, there is a growing consensus in the value of using assurance cases for that purpose. This paper develops a quantitative notion of assurance that an LES is dependable, as a core component of its assurance case, also extending our prior work that applied to ML components. Specifically, we characterize LES assurance in the form of assurance measures: a probabilistic quantification of confidence that an LES possesses system-level properties associated with functional capabilities and dependability attributes. We illustrate the utility of assurance measures by application to a real world autonomous aviation system, also describing their role both in i) guiding high-level, runtime risk mitigation decisions and ii) as a core component of the associated dynamic assurance case.

artificial intelligence, assurance measure, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2006.10345

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry:

Transportation > Air (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Government > Military (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Sensing and Signal Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(2 more...)

Wicker, Matthew, Laurenti, Luca, Patane, Andrea, Kwiatkowska, Marta

Probabilistic Safety for Bayesian Neural Networks

We study probabilistic safety for Bayesian Neural Networks (BNNs) under adversarial input perturbations. Given a compact set of input points, $T \subseteq \mathbb{R}^m$, we study the probability w.r.t. the BNN posterior that all the points in $T$ are mapped to the same region $S$ in the output space. In particular, this can be used to evaluate the probability that a network sampled from the BNN is vulnerable to adversarial attacks. We rely on relaxation techniques from non-convex optimization to develop a method for computing a lower bound on probabilistic safety for BNNs, deriving explicit procedures for the case of interval and linear function propagation techniques. We apply our methods to BNNs trained on a regression task, airborne collision avoidance, and MNIST, empirically showing that our approach allows one to certify probabilistic safety of BNNs with millions of parameters.

bnn, equation, probabilistic safety, (15 more...)

2004.10281

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry:

Transportation (0.34)
Information Technology (0.34)
Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Kodryan, Maxim, Kropotov, Dmitry, Vetrov, Dmitry

MARS: Masked Automatic Ranks Selection in Tensor Decompositions

For instance, Tucker (Tucker, Tensor decomposition methods have recently 1966) and canonical polyadic (CP) (Caroll & Chang, 1970) proven to be efficient for compressing and accelerating tensor decompositions are widely known for compressing neural networks. However, the problem and accelerating convolutional networks (Lebedev of optimal decomposition structure determination et al., 2015; Kim et al., 2016; Kossaifi et al., 2019), and is still not well studied while being quite important. Tensor Train (TT) (Oseledets, 2011) decomposition has Specifically, decomposition ranks present been successfully applied for compressing fully-connected the crucial parameter controlling the compressionaccuracy (FC) (Novikov et al., 2015), convolutional (Garipov et al., tradeoff. In this paper, we introduce 2016), recurrent (Yang et al., 2017; Yu et al., 2017), embedding MARS -- a new efficient method for the automatic (Khrulkov et al., 2019) layers.

artificial intelligence, bayesian inference, machine learning, (15 more...)

2006.10859

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Aushev, Alexander, Pesonen, Henri, Heinonen, Markus, Corander, Jukka, Kaski, Samuel

Likelihood-Free Inference with Deep Gaussian Processes

In recent years, surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations. The current state-of-the-art performance for this task has been achieved by Bayesian Optimization with Gaussian Processes (GPs). While this combination works well for unimodal target distributions, it is restricting the flexibility and applicability of Bayesian Optimization for accelerating likelihood-free inference more generally. We address this problem by proposing a Deep Gaussian Process (DGP) surrogate model that can handle more irregularly behaved target distributions. Our experiments show how DGPs can outperform GPs on objective functions with multimodal distributions and maintain a comparable performance in unimodal cases. This confirms that DGPs as surrogate models can extend the applicability of Bayesian Optimization for likelihood-free inference (BOLFI), while adding computational overhead that remains negligible for computationally intensive simulators.

artificial intelligence, machine learning, posterior, (18 more...)

2006.10571

Country:

Europe > Finland > Uusimaa > Helsinki (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(2 more...)

GAT-GMM: Generative Adversarial Training for Gaussian Mixture Models

Farnia, Farzan, Wang, William, Das, Subhro, Jadbabaie, Ali

Learning the distribution of observed data is a basic task in unsupervised learning which has been studied for decades. The recently-introduced concept of Generative Adversarial Networks (GANs) [1] has demonstrated great success in various distribution learning tasks. Unlike the traditional maximum-likelihood-based approaches, GANs learn the distribution of observed data through a zero-sum game between two machine players, a generator G mimicking the true distribution of data and a discriminator D distinguishing the generator's produced samples from real data points. This zero-sum game is typically formulated through a minimax optimization problem where G and D optimize a minimax objective quantifying how dissimilar G's generated samples and real training samples are. In GAN minimax optimization problems, the generator and discriminator functions are commonly chosen as two deep neural networks (DNNs). Leveraging the expressive power of DNNs, GANs have achieved state-of-the-art performance in learning complex distributions of image data [2, 3, 4]. This success, however, is achieved at the cost of their notoriously difficult training procedure which has introduced several challenges to the machine learning community. Addressing these challenges requires a deeper theoretical understanding of GANs, including their approximation, generalization, and optimization properties. Specifically, GANs have been frequently observed to fail in learning multi-modal distributions [5].

artificial intelligence, machine learning, tanh, (15 more...)

2006.10293

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)