AITopics

1711.05957

Country:

North America > United States (0.93)
Asia (0.69)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Goyal, Anirudh, Sordoni, Alessandro, Côté, Marc-Alexandre, Ke, Nan Rosemary, Bengio, Yoshua

Z-Forcing: Training Stochastic Recurrent Networks

arXiv.org Machine Learning, Nov-16-2017

Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNN). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference where the approximate posterior is augmented with an RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost which forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset where the auxiliary cost helps in learning interpretable latent variables. Source Code: https://github.com/anirudh9119/zforcing_nips17
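The training objective described in this abstract can be written schematically as a variational lower bound plus an auxiliary reconstruction term. The following is only a sketch under stated assumptions: the symbols $b_t$ (backward RNN state), $p_\xi$ (auxiliary reconstruction model), and the weight $\alpha$ are illustrative notation introduced here, and the exact conditioning and weighting used by the authors may differ.

```latex
% Schematic objective (assumed notation, not taken verbatim from the paper):
%   x_{1:T} : observed sequence        z_t : per-step latent variable
%   b_t     : state of the backward RNN at step t
%   alpha   : hypothetical weight on the auxiliary cost
\mathcal{L}(x_{1:T}) =
  \sum_{t} \mathbb{E}_{q_\phi(z_t \mid x_{1:T})}
    \Big[ \log p_\theta(x_{t+1} \mid x_{1:t}, z_{1:t})
          + \alpha \, \log p_\xi(b_t \mid z_t) \Big]
  - \sum_{t} \mathrm{KL}\big( q_\phi(z_t \mid x_{1:T}) \,\|\, p_\theta(z_t \mid x_{1:t}, z_{1:t-1}) \big)
```

The first term inside the expectation and the KL term together form the usual lower bound; the $\log p_\xi(b_t \mid z_t)$ term is the auxiliary cost that asks each latent variable to predict the backward state.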

artificial intelligence, machine learning, natural language, (16 more...)

1711.05411

Country:

Europe (0.46)
North America > Canada (0.14)

Genre: Research Report (0.64)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Kuleshov, Volodymyr, Ermon, Stefano

Neural Variational Inference and Learning in Undirected Graphical Models

arXiv.org Machine Learning, Nov-16-2017

Many problems in machine learning are naturally expressed in the language of undirected graphical models. Here, we propose black-box learning and inference algorithms for undirected models that optimize a variational approximation to the log-likelihood of the model. Central to our approach is an upper bound on the log-partition function parametrized by a function q that we express as a flexible neural network. Our bound makes it possible to track the partition function during learning, to speed up sampling, and to train a broad class of hybrid directed/undirected models via a unified variational inference framework. We empirically demonstrate the effectiveness of our method on several popular generative modeling datasets.
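To make the idea of a q-parametrized upper bound on the log-partition function concrete, here is a minimal sketch on a tiny discrete model. It uses the standard second-moment inequality Z^2 <= E_q[(p~(x)/q(x))^2], which is assumed here to be the kind of bound the abstract refers to. The three-node chain model, the coupling value, and all function names are hypothetical, and q is a plain lookup table rather than a neural network.

```python
import itertools
import math

# Hypothetical toy model: 3 binary nodes in a chain with Ising-style couplings.
STATES = list(itertools.product([0, 1], repeat=3))

def unnormalized_p(x, coupling=0.8):
    """Unnormalized probability p~(x) of a configuration x."""
    s = [2 * b - 1 for b in x]  # map {0, 1} to {-1, +1}
    return math.exp(coupling * (s[0] * s[1] + s[1] * s[2]))

def log_z_exact():
    """Exact log-partition function by brute-force enumeration."""
    return math.log(sum(unnormalized_p(x) for x in STATES))

def log_z_upper_bound(q):
    """Upper bound 0.5 * log E_q[(p~/q)^2] for a strictly positive q."""
    # E_q[(p~/q)^2] = sum_x q(x) * p~(x)^2 / q(x)^2 = sum_x p~(x)^2 / q(x)
    second_moment = sum(unnormalized_p(x) ** 2 / q[x] for x in STATES)
    return 0.5 * math.log(second_moment)

# Two choices of q: a crude uniform proposal and the exact normalized model.
uniform_q = {x: 1.0 / len(STATES) for x in STATES}
z = sum(unnormalized_p(x) for x in STATES)
optimal_q = {x: unnormalized_p(x) / z for x in STATES}

print("exact log Z      :", log_z_exact())
print("bound (uniform q):", log_z_upper_bound(uniform_q))
print("bound (optimal q):", log_z_upper_bound(optimal_q))
```

The bound is loose for the uniform q and becomes tight when q matches the normalized model, which is why minimizing it over a flexible family of q is a sensible training signal.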

artificial intelligence, machine learning, undirected model, (16 more...)

1711.02679

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Zhang, Cheng, Butepage, Judith, Kjellstrom, Hedvig, Mandt, Stephan

Advances in Variational Inference

Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. These models are usually intractable and thus require approximate inference. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem. This approach has been successfully used in various models and large-scale applications. In this review, we give an overview of recent trends in variational inference. We first introduce standard mean field variational inference, then review recent advances focusing on the following aspects: (a) scalable VI, which includes stochastic approximations, (b) generic VI, which extends the applicability of VI to a large class of otherwise intractable models, such as non-conjugate models, (c) accurate VI, which includes variational models beyond the mean field approximation or with atypical divergences, and (d) amortized VI, which implements the inference over local latent variables with inference networks. Finally, we provide a summary of promising future research directions.
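As a small illustration of the mean field variational inference mentioned above, the sketch below runs coordinate-ascent updates on a toy model with two binary latent variables. The joint table is made up for illustration, and the function names are hypothetical; the point is only that the evidence lower bound (ELBO) never decreases under these updates and never exceeds the log evidence.

```python
import math

# Hypothetical joint p(z1, z2, x = observed) over two binary latent variables.
JOINT = {(0, 0): 0.30, (0, 1): 0.05, (1, 0): 0.10, (1, 1): 0.15}
LOG_EVIDENCE = math.log(sum(JOINT.values()))  # log p(x)

def elbo(q1, q2):
    """ELBO = E_q[log p(z, x) - log q(z)] for the factorized q(z) = q1(z1) q2(z2)."""
    total = 0.0
    for (z1, z2), p in JOINT.items():
        q = q1[z1] * q2[z2]
        if q > 0:
            total += q * (math.log(p) - math.log(q))
    return total

def normalize(log_weights):
    """Turn unnormalized log-weights into a probability vector."""
    m = max(log_weights)
    weights = [math.exp(v - m) for v in log_weights]
    s = sum(weights)
    return [w / s for w in weights]

def cavi(steps=20):
    """Coordinate-ascent mean field: log q_i(z_i) = E_{q_j}[log p(z, x)] + const."""
    q1, q2 = [0.5, 0.5], [0.5, 0.5]
    history = [elbo(q1, q2)]
    for _ in range(steps):
        q1 = normalize([sum(q2[b] * math.log(JOINT[(a, b)]) for b in (0, 1))
                        for a in (0, 1)])
        q2 = normalize([sum(q1[a] * math.log(JOINT[(a, b)]) for a in (0, 1))
                        for b in (0, 1)])
        history.append(elbo(q1, q2))
    return q1, q2, history

q1, q2, history = cavi()
print("final ELBO  :", history[-1])
print("log evidence:", LOG_EVIDENCE)
```

The remaining gap between the final ELBO and the log evidence is the KL divergence from the factorized q to the true posterior, which is what the "accurate VI" methods in the review try to shrink.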

artificial intelligence, bayesian inference, machine learning, (19 more...)

1711.05597

Country: Europe (0.46)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(3 more...)

Variational Adaptive-Newton Method for Explorative Learning

Khan, Mohammad Emtiyaz, Lin, Wu, Tangkaratt, Voot, Liu, Zuozhu, Nielsen, Didrik

We present the Variational Adaptive Newton (VAN) method which is a black-box optimization method especially suitable for explorative-learning tasks such as active learning and reinforcement learning. Similar to Bayesian methods, VAN estimates a distribution that can be used for exploration, but requires computations that are similar to continuous optimization methods. Our theoretical contribution reveals that VAN is a second-order method that unifies existing methods in distinct fields of continuous optimization, variational inference, and evolution strategies. Our experimental results show that VAN performs well on a wide variety of learning tasks. This work presents a general-purpose explorative-learning method that has the potential to improve learning in areas such as active learning and reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1711.05560

Country: Asia > Japan (0.28)

Genre: Research Report > New Finding (0.89)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
(2 more...)

Obuchi, Tomoyuki, Kabashima, Yoshiyuki

Accelerating Cross-Validation in Multinomial Logistic Regression with $\ell_1$-Regularization

We develop an approximate formula for evaluating a cross-validation estimator of predictive likelihood for multinomial logistic regression regularized by an $\ell_1$-norm. This allows us to avoid repeated optimizations required for literally conducting cross-validation; hence, the computational time can be significantly reduced. The formula is derived through a perturbative approach employing the largeness of the data size and the model dimensionality. Its usefulness is demonstrated on simulated data and the ISOLET dataset from the UCI machine learning repository.

approximation, artificial intelligence, machine learning, (15 more...)

1711.05420

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre:

Research Report > New Finding (0.72)
Research Report > Experimental Study (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Introducing DeepBalance: Random Deep Belief Network Ensembles to Address Class Imbalance

Xenopoulos, Peter

When solving practical classification problems, a practitioner may be faced with class imbalance, meaning that one class (the majority class) has a significantly higher prevalence than the others. Examples of imbalanced classification problems in the literature include [1], [2], [3], [4]. Class imbalance problems may be exacerbated in the future as we discover new methods to collect rare data and the rate of data collection increases. In many class imbalance problems, the minority class is not only the class of interest but also carries the higher misclassification cost, which complicates learning [5]. Machine learning classifiers try to find an optimal decision boundary that fits the training data. As classifiers generally seek the simplest rule that partitions the training data, the simplest rule in imbalanced settings is often to always predict the majority class [6]. Results can be deceptive for such classifiers, as they may achieve high accuracy. For example, in a problem where a minority class occurs 0.1% of the time, an uninformed classifier can achieve 99.9% accuracy by simply always predicting the majority class. Thus, the naturally occurring target class distribution is not optimal for learning in highly imbalanced scenarios [7], [8], [9], [10].
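The 99.9% figure above can be checked with a few lines of arithmetic. The sketch below builds a hypothetical label set with a 0.1% minority class and scores a classifier that always predicts the majority; the function name and the sample counts are illustrative assumptions, not taken from the paper.

```python
def majority_baseline(n_majority, n_minority):
    """Score an 'always predict majority' classifier (assumes n_minority > 0)."""
    y_true = [0] * n_majority + [1] * n_minority  # 0 = majority, 1 = minority
    y_pred = [0] * len(y_true)                    # uninformed classifier
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    minority_recall = true_pos / n_minority
    return accuracy, minority_recall

# Hypothetical data set: 100 minority examples out of 100,000 (0.1%).
accuracy, minority_recall = majority_baseline(n_majority=99_900, n_minority=100)
print(f"accuracy = {accuracy:.3f}, minority recall = {minority_recall:.3f}")
```

Accuracy looks excellent while recall on the minority class is zero, which is why accuracy alone is a misleading metric in this setting.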

artificial intelligence, deepbalance, machine learning, (16 more...)

1709.10056

Country: North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry: Banking & Finance (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.43)

Wald-Kernel: Learning to Aggregate Information for Sequential Inference

Teng, Diyan, Ertin, Emre

Sequential hypothesis testing is a desirable decision-making strategy in any time-sensitive scenario. Compared with fixed sample-size testing, sequential testing can meet the same probability-of-error requirements using fewer samples on average. For a binary detection problem with known density functions, it is well known that accumulating the likelihood ratio statistics is time-optimal under a fixed error rate constraint. This paper considers the problem of learning a binary sequential detector from training samples when density functions are unavailable. We formulate the problem as a constrained likelihood ratio estimation, which can be solved efficiently through convex optimization by imposing Reproducing Kernel Hilbert Space (RKHS) structure on the log-likelihood ratio function. In addition, we provide a computationally efficient approximate solution for large-scale data sets. The proposed algorithm, namely Wald-Kernel, is tested on a synthetic data set and two real-world data sets, together with previous approaches for likelihood ratio estimation. Our empirical results show that the classifier trained through the proposed technique achieves a smaller average sampling cost than previous approaches proposed in the literature for the same error rate.
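For readers unfamiliar with the known-density baseline this abstract builds on, here is a minimal sketch of the classical sequential probability ratio test, which accumulates log-likelihood ratios until a threshold is crossed. This is not the Wald-Kernel algorithm itself: the Gaussian densities, the error rates, and the function names are assumptions chosen for illustration, and the thresholds use Wald's standard approximations.

```python
import math

def sprt(samples, log_lr, alpha=0.05, beta=0.05):
    """Classical SPRT with Wald's approximate thresholds.

    alpha: target false-alarm rate, beta: target miss rate.
    Returns (decision, number of samples consumed).
    """
    upper = math.log((1 - beta) / alpha)  # cross above: decide H1
    lower = math.log(beta / (1 - alpha))  # cross below: decide H0
    total = 0.0
    for n, x in enumerate(samples, start=1):
        total += log_lr(x)  # accumulate the log-likelihood ratio
        if total >= upper:
            return "H1", n
        if total <= lower:
            return "H0", n
    return "undecided", len(samples)

def gaussian_log_lr(x, mu0=0.0, mu1=1.0, sigma=1.0):
    """log N(x; mu1, sigma^2) - log N(x; mu0, sigma^2) for known densities."""
    return ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)

# Deterministic toy streams: every sample sits exactly at one hypothesis mean.
print(sprt([1.0] * 20, gaussian_log_lr))
print(sprt([0.0] * 20, gaussian_log_lr))
```

In the learning setting of the abstract, the hand-written `gaussian_log_lr` would be replaced by a log-likelihood ratio function estimated from training samples.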

classifier, machine learning, reinforcement learning, (20 more...)

1508.07964

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
(2 more...)

@machinelearnbot, Nov-14-2017, 17:34:43 GMT

How Bayesian Networks Are Superior in Understanding Effects of Variables

Bayes Nets (or Bayesian Networks) give remarkable results in determining the effects of many variables on an outcome. They typically perform strongly even in cases when other methods falter or fail. These networks have had relatively little use with business-related problems, although they have worked successfully for years in fields such as scientific research, public safety, aircraft guidance systems and national defense. Importantly, they often outperform regression, particularly in determining variables' effects. Regression is one of the most august multivariate methods, and among the most studied and applied.

artificial intelligence, machine learning, regression, (18 more...)

@machinelearnbot

Country: North America > United States > California > San Francisco County > San Francisco (0.15)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Israelsen, Brett W, Ahmed, Nisar R

"Dave...I can assure you...that it's going to be all right..." -- A definition, case for, and survey of algorithmic assurances in human-autonomy trust relationships

arXiv.org Machine Learning, Nov-14-2017

As technology becomes more advanced, those who design, use and are otherwise affected by it want to know that it will perform correctly, and understand why it does what it does, and how to use it appropriately. In essence they want to be able to trust the systems that are being designed. In this survey we present assurances that are the method by which users can understand how to trust autonomous systems. Trust between humans and autonomy is reviewed, and the implications for the design of assurances are highlighted. A survey of existing research related to assurances is presented. Much of the surveyed research originates from fields such as interpretable, comprehensible, transparent, and explainable machine learning, as well as human-computer interaction, human-robot interaction, and e-commerce. Several key ideas are extracted from this work in order to refine the definition of assurances. The design of assurances is found to be highly dependent not only on the capabilities of the autonomous system, but on the characteristics of the human user, and the appropriate trust-related behaviors. Several directions for future research are identified and discussed.

assurance, machine learning, natural language, (20 more...)

1711.03846

Country:

Europe (0.67)
North America > United States > Pennsylvania (0.27)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine (1.00)
Government > Military (1.00)
Transportation (0.94)
(3 more...)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(7 more...)