AITopics

doi: 10.1049/iet-its.2017.0379

1606.01284

Country:

Asia > China (0.70)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceJun-3-2016

Thompson Sampling is Asymptotically Optimal in General Environments

Leike, Jan, Lattimore, Tor, Orseau, Laurent, Hutter, Marcus

We discuss a variant of Thompson sampling for nonparametric reinforcement learning in a countable classes of general stochastic environments. These environments can be non-Markov, non-ergodic, and partially observable. We show that Thompson sampling learns the environment class in the sense that (1) asymptotically its value converges to the optimal value in mean and (2) given a recoverability assumption regret is sublinear.

artificial intelligence, machine learning, thompson, (17 more...)

arXiv.org Artificial Intelligence

1602.07905

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)

@machinelearnbotJun-2-2016, 18:22:23 GMT

When we say PhD in NLP or PhD in bayesian networks or PhD in boosting, how all the topics listed below are related? • /r/MachineLearning

There are three different types of topics in machine learning, the first ones are like NLP, Computer vision, Robotics etc. and other ones are algorithms in machine learning like genetic algorithms, neural networks, bayesian networks etc and thirdly there are concepts like decision trees, random forest, PCA etc. So, how are all these topics related when I say PhD in Bayesian Networks or PhD in NLP or PhD in boosting etc?

decision tree learning, machine learning, phd, (5 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.81)

Flaxman, Seth, Sejdinovic, Dino, Cunningham, John P., Filippi, Sarah

Bayesian Learning of Kernel Embeddings

arXiv.org Machine LearningJun-2-2016

Kernel methods are one of the mainstays of machine learning, but the problem of kernel learning remains challenging, with only a few heuristics and very little theory. This is of particular importance in methods based on estimation of kernel mean embeddings of probability measures. For characteristic kernels, which include most commonly used ones, the kernel mean embedding uniquely determines its probability measure, so it can be used to design a powerful statistical testing framework, which includes nonparametric two-sample and independence tests. In practice, however, the performance of these tests can be very sensitive to the choice of kernel and its lengthscale parameters. To address this central issue, we propose a new probabilistic model for kernel mean embeddings, the Bayesian Kernel Embedding model, combining a Gaussian process prior over the Reproducing Kernel Hilbert Space containing the mean embedding with a conjugate likelihood function, thus yielding a closed form posterior over the mean embedding. The posterior mean of our model is closely related to recently proposed shrinkage estimators for kernel mean embeddings, while the posterior uncertainty is a new, interesting feature with various possible applications. Critically for the purposes of kernel learning, our model gives a simple, closed form marginal pseudolikelihood of the observed data given the kernel hyperparameters. This marginal pseudolikelihood can either be optimized to inform the hyperparameter choice or fully Bayesian inference can be used.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1603.0216

Country: Europe > United Kingdom (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Xu, Ning, Hong, Jian, Fisher, Timothy C. G.

Model selection consistency from the perspective of generalization ability and VC theory with an application to Lasso

Model selection is difficult to analyse yet theoretically and empirically important, especially for high-dimensional data analysis. Recently the least absolute shrinkage and selection operator (Lasso) has been applied in the statistical and econometric literature. Consis- tency of Lasso has been established under various conditions, some of which are difficult to verify in practice. In this paper, we study model selection from the perspective of generalization ability, under the framework of structural risk minimization (SRM) and Vapnik-Chervonenkis (VC) theory. The approach emphasizes the balance between the in-sample and out-of-sample fit, which can be achieved by using cross-validation to select a penalty on model complexity. We show that an exact relationship exists between the generalization ability of a model and model selection consistency. By implementing SRM and the VC inequality, we show that Lasso is L2-consistent for model selection under assumptions similar to those imposed on OLS. Furthermore, we derive a probabilistic bound for the distance between the penalized extremum estimator and the extremum estimator without penalty, which is dominated by overfitting. We also propose a new measurement of overfitting, GR2, based on generalization ability, that converges to zero if model selection is consistent. Using simulations, we demonstrate that the proposed CV-Lasso algorithm performs well in terms of model selection and overfitting control.

artificial intelligence, lasso, machine learning, (16 more...)

1606.00142

Country: North America > United States (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Guo, Zhaohan Daniel, Doroudi, Shayan, Brunskill, Emma

A PAC RL Algorithm for Episodic POMDPs

Many interesting real world domains involve reinforcement learning (RL) in partially observable environments. Efficient learning in such domains is important, but existing sample complexity bounds for partially observable RL are at least exponential in the episode length. We give, to our knowledge, the first partially observable RL algorithm with a polynomial bound on the number of episodes on which the algorithm may not achieve near-optimal performance. Our algorithm is suitable for an important class of episodic POMDPs. Our approach builds on recent advances in method of moments for latent variable model estimation.

artificial intelligence, machine learning, pomdp, (15 more...)

1605.08062

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Mnih, Andriy, Rezende, Danilo J.

Variational inference for Monte Carlo objectives

Recent progress in deep latent variable models has largely been driven by the development of flexible and scalable variational inference methods. Variational training of this type involves maximizing a lower bound on the log-likelihood, using samples from the variational posterior to compute the required gradients. Recently, Burda et al. (2016) have derived a tighter lower bound using a multi-sample importance sampling estimate of the likelihood and showed that optimizing it yields models that use more of their capacity and achieve higher likelihoods. This development showed the importance of such multisample objectives and explained the success of several related approaches. We extend the multi-sample approach to discrete latent variables and analyze the difficulty encountered when estimating the gradients involved. We then develop the first unbiased gradient estimator designed for importance-sampled objectives and evaluate it at training generative and structured output prediction models. The resulting estimator, which is based on low-variance per-sample learning signals, is both simpler and more effective than the NVIL estimator (Mnih & Gregor, 2014) proposed for the single-sample variational objective, and is competitive with the currently used biased estimators.

artificial intelligence, estimator, machine learning, (18 more...)

1602.06725

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.89)

Hernández-Lobato, José Miguel, Li, Yingzhen, Rowland, Mark, Hernández-Lobato, Daniel, Bui, Thang, Turner, Richard E.

Black-box $\alpha$-divergence Minimization

Black-box alpha (BB-$\alpha$) is a new approximate inference method based on the minimization of $\alpha$-divergences. BB-$\alpha$ scales to large datasets because it can be implemented using stochastic gradient descent. BB-$\alpha$ can be applied to complex probabilistic models with little effort since it only requires as input the likelihood function and its gradients. These gradients can be easily obtained using automatic differentiation. By changing the divergence parameter $\alpha$, the method is able to interpolate between variational Bayes (VB) ($\alpha \rightarrow 0$) and an algorithm similar to expectation propagation (EP) ($\alpha = 1$). Experiments on probit regression and neural network regression and classification problems show that BB-$\alpha$ with non-standard settings of $\alpha$, such as $\alpha = 0.5$, usually produces better predictions than with $\alpha \rightarrow 0$ (VB) or $\alpha = 1$ (EP).

artificial intelligence, machine learning, posterior, (19 more...)

1511.03243

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (1.00)

Industry:

Energy (0.68)
Transportation > Air (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

#artificialintelligenceMay-31-2016, 18:46:09 GMT

Implementing your own spam filter by Cambridge Coding Academy

This post teaches you how to implement your own spam filter in under 100 lines of Python code. While doing this hands-on exercise, you'll work with natural language data, learn how to detect the words spammers use automatically, and learn how to use a Naive Bayes classifier for binary classification. The task is to distinguish between two types of emails, "spam" and "non-spam" often called "ham". The machine learning classifier will detect that an email is spam if it is characterised by certain features. The textual content of the email – words like "Viagra" or "lottery" or phrases like "You've won a 100,000,000 dollars!

classifier, machine learning, spam filtering, (16 more...)

#artificialintelligence

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)

Genre: Instructional Material > Course Syllabus & Notes (0.75)

Technology:

Information Technology > Security & Privacy > Spam Filtering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Cusumano-Towner, Marco F, Mansinghka, Vikash K

Quantifying the probable approximation error of probabilistic inference programs

arXiv.org Machine LearningMay-31-2016

This paper introduces a new technique for quantifying the approximation error of a broad class of probabilistic inference programs, including ones based on both variational and Monte Carlo approaches. The key idea is to derive a subjective bound on the symmetrized KL divergence between the distribution achieved by an approximate inference program and its true target distribution. The bound's validity (and subjectivity) rests on the accuracy of two auxiliary probabilistic programs: (i) a "reference" inference program that defines a gold standard of accuracy and (ii) a "meta-inference" program that answers the question "what internal random choices did the original approximate inference program probably make given that it produced a particular result?" The paper includes empirical results on inference problems drawn from linear regression, Dirichlet process mixture modeling, HMMs, and Bayesian networks. The experiments show that the technique is robust to the quality of the reference inference program and that it can detect implementation bugs that are not apparent from predictive performance.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1606.00068

Country: North America > United States (0.93)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)