AITopics

1612.04111

Country: North America > United States > Pennsylvania (0.28)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.48)

Industry: Education > Educational Setting > Online (0.85)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

Şimşekli, Umut, Badeau, Roland, Cemgil, A. Taylan, Richard, Gaël

Stochastic Quasi-Newton Langevin Monte Carlo

arXiv.org Machine Learning, Dec-12-2016

Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods have been proposed for scaling up Monte Carlo computations to large data problems. Whilst these approaches have proven useful in many applications, vanilla SG-MCMC might suffer from poor mixing rates when random variables exhibit strong couplings under the target densities or big scale differences. In this study, we propose a novel SG-MCMC method that takes the local geometry into account by using ideas from Quasi-Newton optimization methods. These second order methods directly approximate the inverse Hessian by using a limited history of samples and their gradients. Our method uses dense approximations of the inverse Hessian while keeping the time and memory complexities linear with the dimension of the problem. We provide a formal theoretical analysis where we show that the proposed method is asymptotically unbiased and consistent with the posterior expectations. We illustrate the effectiveness of the approach on both synthetic and real datasets. Our experiments on two challenging applications show that our method achieves fast convergence rates similar to Riemannian approaches while at the same time having low computational requirements similar to diagonal preconditioning approaches.
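As a point of reference for the "vanilla SG-MCMC" baseline the abstract mentions, here is a minimal sketch of plain stochastic gradient Langevin dynamics (SGLD), not the quasi-Newton method proposed in the paper. The toy model (unit-variance Gaussian data with a Gaussian prior on the mean), the step size, the batch size, and the function name `sgld_gaussian_mean` are all assumptions chosen for illustration.

```python
import math
import random

def sgld_gaussian_mean(data, n_steps=5000, batch=50, eps=1e-4,
                       prior_var=100.0, seed=0):
    """Vanilla SGLD for the mean of N(theta, 1) data with a N(0, prior_var) prior."""
    rng = random.Random(seed)
    n_total = len(data)
    theta, samples = 0.0, []
    for _ in range(n_steps):
        minibatch = rng.sample(data, batch)
        # Unbiased minibatch estimate of the gradient of the log posterior.
        grad = (-theta / prior_var
                + (n_total / batch) * sum(x - theta for x in minibatch))
        # Langevin step: half a step along the gradient plus N(0, eps) noise.
        theta += 0.5 * eps * grad + rng.gauss(0.0, math.sqrt(eps))
        samples.append(theta)
    return samples

# Synthetic data (assumed for the example): 1000 draws from N(2, 1).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]

samples = sgld_gaussian_mean(data)
kept = samples[1000:]                      # discard burn-in
estimate = sum(kept) / len(kept)           # SGLD estimate of the posterior mean
exact = sum(data) / (len(data) + 1.0 / 100.0)  # closed-form posterior mean
print(estimate, exact)
```

Every coordinate here shares one scalar step size `eps`, which is the setting in which strongly coupled or badly scaled variables can mix slowly; preconditioned variants replace that scalar with a matrix.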

artificial intelligence, hamcmc, machine learning, (15 more...)

1602.03442

Country:

Europe (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)

#artificialintelligence, Dec-11-2016, 02:15:20 GMT

Searching for the Master Algorithm - New Signature

It may sound trite, but humanity has come to dominate the world using this tool alone. Humans lack natural weapons, have no natural protection from the elements, and enter life as helpless infants. But our unique brains allow us to acquire, use, and communicate knowledge, and this advantage alone has allowed us to create the intricate social and technological reality we now inhabit. Our brains evolved to process, store, retrieve, and integrate sensory data into working knowledge that allows us to navigate reality. Until recently, humans were the only significant force that could translate raw data into accurate, actionable knowledge.

artificial intelligence, knowledge, machine learning, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.33)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.33)

#artificialintelligence, Dec-10-2016, 17:50:23 GMT

Bayes Theorem: A Visual Introduction For Beginners

From Google search results to Netflix recommendations and investment strategies, Bayes Theorem (also often called Bayes Rule or Bayes Formula) is used across countless industries to help calculate and assess probability. Bayesian statistics is taught in most first-year statistics classes across the nation, but there is one major problem that many students (and others who are interested in the theorem) face. The theorem is not intuitive for most people, and understanding how it works can be a challenge, especially because it is often taught without visual aids. In this guide, we unpack the various components of the theorem and provide a basic overview of how it works, with illustrations to help. Three scenarios – the flu, breathalyzer tests, and peacekeeping – are used throughout the booklet to teach how problems involving Bayes Theorem can be approached and solved.
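To make the flu scenario concrete, here is a small sketch of the calculation Bayes Theorem performs for a diagnostic test. The numbers (5% prevalence, 90% sensitivity, 10% false-positive rate) and the function name `posterior` are hypothetical, chosen only for illustration; they are not taken from the booklet.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Total probability of a positive test: true positives + false positives.
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence

# Hypothetical numbers: 5% of patients have the flu, the test catches 90%
# of true cases, and 10% of healthy patients test positive anyway.
p = posterior(prior=0.05, sensitivity=0.90, false_positive_rate=0.10)
print(round(p, 4))  # 0.045 / 0.14, roughly 0.3214
```

Under these assumed numbers a positive result raises the probability of flu from 5% to about 32%, well short of certainty, because false positives among the many healthy patients outnumber true positives among the few sick ones.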

artificial intelligence, bayes theorem, machine learning, (14 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.37)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning, Dec-10-2016

Improved prediction accuracy for disease risk mapping using Gaussian Process stacked generalisation

Bhatt, Samir, Cameron, Ewan, Flaxman, Seth R, Weiss, Daniel J, Smith, David L, Gething, Peter W

Maps of infectious disease---charting spatial variations in the force of infection, degree of endemicity, and the burden on human health---provide an essential evidence base to support planning towards global health targets. Contemporary disease mapping efforts have embraced statistical modelling approaches to properly acknowledge uncertainties in both the available measurements and their spatial interpolation. The most common such approach is that of Gaussian process regression, a mathematical framework comprised of two components: a mean function harnessing the predictive power of multiple independent variables, and a covariance function yielding spatio-temporal shrinkage against residual variation from the mean. Though many techniques have been developed to improve the flexibility and fitting of the covariance function, models for the mean function have typically been restricted to simple linear terms. For infectious diseases, known to be driven by complex interactions between environmental and socio-economic factors, improved modelling of the mean function can greatly boost predictive power. Here we present an ensemble approach based on stacked generalisation that allows for multiple, non-linear algorithmic mean functions to be jointly embedded within the Gaussian process framework. We apply this method to mapping Plasmodium falciparum prevalence data in Sub-Saharan Africa and show that the generalised ensemble approach markedly out-performs any individual method.

artificial intelligence, machine learning, modeling & simulation, (19 more...)

1612.03278

Country:

Africa (1.00)
North America > United States (0.93)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

Daskalakis, Constantinos, Pan, Qinxuan

Square Hellinger Subadditivity for Bayesian Networks and its Applications to Identity Testing

arXiv.org Machine Learning, Dec-9-2016

We show that the square Hellinger distance between two Bayesian networks on the same directed graph, $G$, is subadditive with respect to the neighborhoods of $G$. Namely, if $P$ and $Q$ are the probability distributions defined by two Bayesian networks on the same DAG, our inequality states that the square Hellinger distance, $H^2(P,Q)$, between $P$ and $Q$ is upper bounded by the sum, $\sum_v H^2(P_{\{v\} \cup \Pi_v}, Q_{\{v\} \cup \Pi_v})$, of the square Hellinger distances between the marginals of $P$ and $Q$ on every node $v$ and its parents $\Pi_v$ in the DAG. Importantly, our bound does not involve the conditionals but the marginals of $P$ and $Q$. We derive a similar inequality for more general Markov Random Fields. As an application of our inequality, we show that distinguishing whether two Bayesian networks $P$ and $Q$ on the same (but potentially unknown) DAG satisfy $P=Q$ vs $d_{\rm TV}(P,Q)>\epsilon$ can be performed from $\tilde{O}(|\Sigma|^{3/4(d+1)} \cdot n/\epsilon^2)$ samples, where $d$ is the maximum in-degree of the DAG and $\Sigma$ the domain of each variable of the Bayesian networks. If $P$ and $Q$ are defined on potentially different and potentially unknown trees, the sample complexity becomes $\tilde{O}(|\Sigma|^{4.5} n/\epsilon^2)$, whose dependence on $n, \epsilon$ is optimal up to logarithmic factors. Lastly, if $P$ and $Q$ are product distributions over $\{0,1\}^n$ and $Q$ is known, the sample complexity becomes $O(\sqrt{n}/\epsilon^2)$, which is optimal up to constant factors.
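The subadditivity inequality can be checked numerically on a small example. The sketch below builds two random Bayesian networks on the same binary chain $X_1 \to X_2 \to X_3$ and compares $H^2(P,Q)$ with the sum over the neighborhoods $\{X_1\}$, $\{X_1, X_2\}$, $\{X_2, X_3\}$. The chain structure, the random conditional tables, and the normalization $H^2(P,Q) = 1 - \sum_x \sqrt{P(x)Q(x)}$ are assumptions made for this illustration, and one numerical instance is of course not a proof.

```python
import itertools
import math
import random

def sq_hellinger(p, q):
    """H^2 = 1 - sum_x sqrt(p(x) q(x)) for dicts over the same outcomes."""
    return 1.0 - sum(math.sqrt(p[x] * q[x]) for x in p)

def random_chain(rng):
    """Joint distribution of a binary chain X1 -> X2 -> X3 with random CPTs."""
    def coin():
        a = rng.uniform(0.05, 0.95)
        return (a, 1.0 - a)
    p1 = coin()             # P(x1)
    p2 = [coin(), coin()]   # p2[x1][x2] = P(x2 | x1)
    p3 = [coin(), coin()]   # p3[x2][x3] = P(x3 | x2)
    return {(a, b, c): p1[a] * p2[a][b] * p3[b][c]
            for a, b, c in itertools.product((0, 1), repeat=3)}

def marginal(joint, keep):
    """Marginalize a joint (dict keyed by tuples) onto the indices in `keep`."""
    out = {}
    for x, pr in joint.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + pr
    return out

rng = random.Random(0)
P, Q = random_chain(rng), random_chain(rng)

lhs = sq_hellinger(P, Q)
# Each node together with its parents: {X1}, {X1, X2}, {X2, X3}.
neighborhoods = [(0,), (0, 1), (1, 2)]
rhs = sum(sq_hellinger(marginal(P, k), marginal(Q, k)) for k in neighborhoods)
print(lhs, rhs, lhs <= rhs)
```

The right-hand side only needs marginals on at most two variables at a time, which is what makes the bound useful for testing: those low-dimensional marginals can be estimated from far fewer samples than the full joint.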

artificial intelligence, hellinger distance, machine learning, (18 more...)

1612.03164

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Wabersich, Kim Peter, Toussaint, Marc

Advancing Bayesian Optimization: The Mixed-Global-Local (MGL) Kernel and Length-Scale Cool Down

arXiv.org Machine Learning, Dec-9-2016

Bayesian Optimization (BO) has become a core method for solving expensive black-box optimization problems. While much research has focused on the choice of the acquisition function, we focus on online length-scale adaptation and the choice of kernel function. Instead of choosing hyperparameters in view of maximum likelihood on past data, we propose to use the acquisition function to decide on hyperparameter adaptation more robustly and in view of the future optimization progress. Further, we propose a particular kernel function that includes non-stationarity and local anisotropy and thereby implicitly integrates the efficiency of local convex optimization with global Bayesian optimization. Comparisons to state-of-the-art BO methods underline the efficiency of these mechanisms on global optimization benchmarks.

artificial intelligence, machine learning, optimization, (17 more...)

1612.03117

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

#artificialintelligence, Dec-7-2016, 12:05:11 GMT

What Skills Are Artificial Intelligence Students Learning?

Uninformed Search: This is used when creating an action sequence that doesn't account for any changes along the way. Heuristic Functions: These allow for decisions to be made without accurate or complete information. Adversarial or Moving Agent Search: This is used when there are other entities making decisions that influence one another. Piotr Gmytrasiewicz, associate professor in the department of computer science at the University of Illinois at Chicago, teaches three courses: Artificial Intelligence 1, Artificial Intelligence 2 and Applied Artificial Intelligence.

artificial intelligence, artificial intelligence student learning, machine learning, (11 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.25)
North America > United States > Missouri (0.05)
Asia > India > Maharashtra > Mumbai (0.05)

Industry:

Education (1.00)
Automobiles & Trucks > Manufacturer (0.34)
Transportation > Passenger (0.32)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.34)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.32)

Song, Jun, Moore, David A.

Parallel Chromatic MCMC with Spatial Partitioning

arXiv.org Machine Learning, Dec-7-2016

We introduce a novel approach for parallelizing MCMC inference in models with spatially determined conditional independence relationships, for which existing techniques exploiting graphical model structure are not applicable. Our approach is motivated by a model of seismic events and signals, where events detected in distant regions are approximately independent given those in intermediate regions. We perform parallel inference by coloring a factor graph defined over regions of latent space, rather than individual model variables. Evaluating on a model of seismic event detection, we achieve significant speedups over serial MCMC with no degradation in inference quality.

artificial intelligence, inference, machine learning, (15 more...)

1612.00595

Country: North America > United States > California (0.46)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

#artificialintelligence, Dec-6-2016, 19:20:34 GMT

Predicting with confidence: the best machine learning idea you never heard of

One of the disadvantages of machine learning as a discipline is the lack of reasonable confidence intervals on a given prediction. There are all kinds of reasons you might want such a thing, but I think machine learning and data science practitioners are so drunk with newfound powers, they forget where such a thing might be useful. If you're really confident, for example, that someone will click on an ad, you probably want to serve one that pays a nice click-through rate. If you have some kind of gambling engine, you want to bet more money on the predictions you are more confident of. Or if you're diagnosing an illness in a patient, it would be awfully nice to be able to tell the patient how certain you are of the diagnosis and what the confidence in the prognosis is. There are various ad hoc ways that people do this sort of thing.

artificial intelligence, machine learning, prediction, (17 more...)

Country: North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)