AITopics

1507.05117

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Hernández-Lobato, Daniel, Hernández-Lobato, José Miguel

Scalable Gaussian Process Classification via Expectation Propagation

arXiv.org Machine Learning Jul-16-2015

Variational methods have been recently considered for scaling the training process of Gaussian process classifiers to large datasets. As an alternative, we describe here how to train these classifiers efficiently using expectation propagation. The proposed method allows for handling datasets with millions of data instances. More precisely, it can be used for (i) training in a distributed fashion where the data instances are sent to different nodes in which the required computations are carried out, and for (ii) maximizing an estimate of the marginal likelihood using a stochastic approximation of the gradient. Several experiments indicate that the method described is competitive with the variational approach.

approximation, artificial intelligence, machine learning, (18 more...)

1507.04513

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Modeling & Simulation (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning Jul-16-2015

On the Convergence of Stochastic Variational Inference in Bayesian Networks

Paquet, Ulrich

We highlight a pitfall when applying stochastic variational inference to general Bayesian networks. For global random variables approximated by an exponential family distribution, natural gradient steps, commonly starting from a unit length step size, are averaged to convergence. This useful insight into the scaling of initial step sizes is lost when the approximation factorizes across a general Bayesian network, and care must be taken to ensure practical convergence. We experimentally investigate how much of the baby (well-scaled steps) is thrown out with the bath water (exact gradients).
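
The remark about unit-length initial steps can be illustrated with a minimal sketch. It assumes the standard stochastic variational inference update for a conjugate exponential-family global variable, lam <- (1 - rho_t) * lam + rho_t * lam_hat_t, together with the schedule rho_t = 1 / t, under which the iterate equals the running average of the noisy estimates. The function name and the toy estimates are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def svi_natural_gradient_steps(estimates):
    """Apply lam <- (1 - rho_t) * lam + rho_t * lam_hat_t with rho_t = 1 / t."""
    lam = np.zeros_like(estimates[0], dtype=float)  # overwritten by the unit step at t = 1
    for t, lam_hat in enumerate(estimates, start=1):
        rho = 1.0 / t  # unit-length first step, then decaying (assumed schedule)
        lam = (1.0 - rho) * lam + rho * lam_hat
    return lam

# Toy noisy mini-batch estimates of a 3-dimensional natural parameter (made up).
rng = np.random.default_rng(0)
noisy_estimates = rng.normal(loc=2.0, scale=1.0, size=(50, 3))

final_lam = svi_natural_gradient_steps(noisy_estimates)
running_mean = noisy_estimates.mean(axis=0)
print(np.allclose(final_lam, running_mean))
```

When the approximation factorizes across a general Bayesian network, the gradient is no longer of this "estimate minus current value" form, so a step size of one has no comparable interpretation; the sketch covers only the well-scaled conjugate case.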

artificial intelligence, bayesian inference, machine learning, (13 more...)

1507.04505

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.82)

Melchior, Jan, Fischer, Asja, Wiskott, Laurenz

How to Center Binary Deep Boltzmann Machines

arXiv.org Machine Learning Jul-16-2015

This work analyzes centered binary Restricted Boltzmann Machines (RBMs) and binary Deep Boltzmann Machines (DBMs), where centering is done by subtracting offset values from visible and hidden variables. We show analytically that (i) centering results in a different but equivalent parameterization for artificial neural networks in general, (ii) the expected performance of centered binary RBMs/DBMs is invariant under a simultaneous flip of data and offsets, for any offset value in the range of zero to one, (iii) centering can be reformulated as a different update rule for normal binary RBMs/DBMs, and (iv) using the enhanced gradient is equivalent to setting the offset values to the average over model and data mean. Furthermore, numerical simulations suggest that (i) optimal generative performance is achieved by subtracting mean values from visible as well as hidden variables, (ii) centered RBMs/DBMs reach significantly higher log-likelihood (LL) values than normal binary RBMs/DBMs, (iii) centering variants whose offsets depend on the model mean, like the enhanced gradient, suffer from severe divergence problems, (iv) learning is stabilized if an exponentially moving average over the batch means is used for the offset values instead of the current batch mean, which also prevents the enhanced gradient from diverging, (v) centered RBMs/DBMs reach higher LL values than normal RBMs/DBMs while having a smaller norm of the weight matrix, (vi) centering leads to an update direction that is closer to the natural gradient, and the natural gradient is extremely efficient for training RBMs, (vii) centering dispenses with the need for greedy layer-wise pre-training of DBMs, (viii) pre-training often even worsens the results, regardless of whether centering is used, and (ix) centering is also beneficial for autoencoders.
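
The claim that centering is a different but equivalent parameterization can be checked numerically on a tiny RBM. The sketch below assumes the usual centered energy, E(v, h) = -(v - mu)^T b - (h - lam)^T c - (v - mu)^T W (h - lam), and verifies that it differs from the energy of a normal RBM with shifted biases only by a constant, so both define the same distribution. All names, sizes, and the decay value in the moving-average helper are illustrative assumptions, not values from the paper.

```python
import itertools
import numpy as np

def centered_energy(v, h, W, b, c, mu, lam):
    """Energy of a centered binary RBM with visible offsets mu and hidden offsets lam."""
    return -(v - mu) @ b - (h - lam) @ c - (v - mu) @ W @ (h - lam)

def normal_energy(v, h, W, b, c):
    """Energy of a normal (uncentered) binary RBM."""
    return -v @ b - h @ c - v @ W @ h

rng = np.random.default_rng(0)
n_visible, n_hidden = 3, 2
W = rng.normal(size=(n_visible, n_hidden))
b = rng.normal(size=n_visible)
c = rng.normal(size=n_hidden)
mu = rng.uniform(size=n_visible)   # visible offsets in [0, 1)
lam = rng.uniform(size=n_hidden)   # hidden offsets in [0, 1)

# Biases of the equivalent normal RBM; the weights are unchanged.
b_normal = b - W @ lam
c_normal = c - W.T @ mu

# Energy gap over every joint binary state; it should be a single constant.
gaps = []
for bits in itertools.product([0.0, 1.0], repeat=n_visible + n_hidden):
    v = np.array(bits[:n_visible])
    h = np.array(bits[n_visible:])
    gaps.append(centered_energy(v, h, W, b, c, mu, lam)
                - normal_energy(v, h, W, b_normal, c_normal))
gaps = np.array(gaps)
print(np.allclose(gaps, gaps[0]))

def update_offset(offset, batch_mean, decay=0.99):
    """Exponentially moving average of batch means (the decay value is an assumption)."""
    return decay * offset + (1.0 - decay) * batch_mean
```

The helper `update_offset` shows the moving-average form of the offset update mentioned in the abstract; how it is wired into training is not specified here.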

artificial intelligence, deep learning, machine learning, (19 more...)

1311.1354

Country: North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Hernández-Lobato, José Miguel, Adams, Ryan P.

Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks

arXiv.org Machine Learning Jul-15-2015

Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large dataset and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets shows that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.
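
A forward propagation of probabilities can be sketched as moment matching through one layer. The code below is a simplified illustration under stated assumptions (independent Gaussian weights, independent inputs, a ReLU nonlinearity, no input scaling) and is not the authors' implementation; the function names are hypothetical. It propagates elementwise means and variances through a linear map and then through the rectifier using the closed-form moments of a rectified Gaussian.

```python
import math
import numpy as np

# Standard normal CDF applied elementwise, built from math.erf.
_std_normal_cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))

def linear_layer_moments(m_z, v_z, M, V):
    """Mean and variance of a = W @ z for independent weights W and inputs z.

    m_z, v_z: elementwise mean and variance of the input z.
    M, V: elementwise mean and variance of the weight matrix W.
    """
    m_a = M @ m_z
    v_a = (M ** 2) @ v_z + V @ (m_z ** 2) + V @ v_z
    return m_a, v_a

def relu_moments(m_a, v_a):
    """Mean and variance of max(a, 0) for a ~ N(m_a, v_a), elementwise (needs v_a > 0)."""
    s = np.sqrt(v_a)
    alpha = m_a / s
    cdf = _std_normal_cdf(alpha)
    pdf = np.exp(-0.5 * alpha ** 2) / math.sqrt(2.0 * math.pi)
    mean = m_a * cdf + s * pdf
    second_moment = (m_a ** 2 + v_a) * cdf + m_a * s * pdf
    return mean, second_moment - mean ** 2

# Toy layer: deterministic input, uncertain weights (all values are made up).
m_z = np.array([1.0, -2.0])
v_z = np.array([0.0, 0.0])
M = np.array([[0.5, 1.0], [-1.0, 0.0]])
V = np.full((2, 2), 0.1)

m_a, v_a = linear_layer_moments(m_z, v_z, M, V)
m_out, v_out = relu_moments(m_a, v_a)
print(m_a, v_a, m_out, v_out)
```

Stacking such layers gives a Gaussian approximation to the network output; the backward computation of gradients described in the abstract would then differentiate a marginal likelihood estimate with respect to the weight means and variances, which this sketch does not cover.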

approximation, bayesian inference, neural network, (18 more...)

1502.05336

Country:

North America > United States > Massachusetts (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning Behaviors in Agents Systems with Interactive Dynamic Influence Diagrams

Conroy, Ross (Teesside University) | Zeng, Yifeng (Teesside University) | Cavazza, Marc (Teesside University) | Chen, Yingke (University of Georgia)

Interactive dynamic influence diagrams (I-DIDs) are a well-recognized decision model that explicitly considers how multiagent interaction affects individual decision making. To predict the behavior of other agents, I-DIDs require models of the other agents to be known ahead of time and manually encoded. This becomes a barrier to I-DID applications in a human-agent interaction setting, such as the development of intelligent non-player characters (NPCs) in real-time strategy (RTS) games, where models of other agents or human players are often inaccessible to domain experts. In this paper, we use automatic techniques for learning the behavior of other agents from replay data in RTS games. We propose a learning algorithm that improves on existing work by building a full profile of agent behavior. This is the first time that data-driven learning techniques are embedded into the I-DID decision making framework. We evaluate the performance of our approach on two test cases.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > North Yorkshire > Middlesbrough (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Georgia > Clarke County > Athens (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Leisure & Entertainment > Games > Computer Games (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Adaptive Dropout Rates for Learning with Corrupted Features

Zhuo, Jingwei (Tsinghua University) | Zhu, Jun (Tsinghua University) | Zhang, Bo (Tsinghua University)

Feature noising is an effective mechanism for reducing the risk of overfitting. To avoid an explosive search space, existing work typically assumes that all features share a single noise level, which is often cross-validated. In this paper, we present a Bayesian feature noising model that flexibly allows for dimension-specific or group-specific noise levels, and we derive a learning algorithm that adaptively updates these noise levels. Our adaptive rule is simple and interpretable, drawing a direct connection to the fitness of each individual feature or feature group. Empirical results on various datasets demonstrate its effectiveness in avoiding extensive tuning and sometimes improving the performance due to its flexibility.

classification, dropout level, noise level, (16 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > China > Jiangsu Province > Xuzhou (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

CoBots: Robust Symbiotic Autonomous Mobile Service Robots

Veloso, Manuela (Carnegie Mellon University) | Biswas, Joydeep (Carnegie Mellon University) | Coltin, Brian (Carnegie Mellon University) | Rosenthal, Stephanie (Carnegie Mellon University)

We research and develop autonomous mobile service robots as Collaborative Robots, i.e., CoBots. For the last three years, our four CoBots have autonomously navigated in our multi-floor office buildings for more than 1,000 km, as the result of the integration of multiple perceptual, cognitive, and actuation representations and algorithms. In this paper, we identify a few core aspects of our CoBots underlying their robust functionality. The reliable mobility in the varying indoor environments comes from a novel episodic non-Markov localization. Service tasks requested by users are the input to a scheduler that can consider different types of constraints, including transfers among multiple robots. With symbiotic autonomy, the CoBots proactively seek external sources of help to fill in for their inevitable occasional limitations. We present sampled results from a deployment and conclude with a brief review of other features of our service robots.

cobot, robot, veloso, (14 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Overview (0.34)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Irissappane, Athirai A. (Nanyang Technological University)

Exploiting Trust Information to Cope with Malicious Entities in Multi-Agent Systems

Our research is within the area of artificial intelligence and multi-agent systems. More specifically, we focus on evaluating trust relationships between the agents in multi-agent e-marketplaces and sensor networks and aim to address the following problems: 1) how to identify a trustworthy (good quality) agent; 2) how to cope with dishonest advisors, i.e., agents who provide misleading opinions about others.

node, pomdp, sensor node, (14 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: Asia > Singapore (0.05)

Industry: Information Technology > Security & Privacy (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Inapproximability of Treewidth and Related Problems (Extended Abstract)

Wu, Yu (Facebook AI Research Lab) | Austrin, Per (KTH Royal Institute of Technology) | Pitassi, Toniann (University of Toronto) | Liu, David (University of Toronto)

Graphical models, such as Bayesian networks and Markov networks, play an important role in artificial intelligence and machine learning. Inference is a central problem to be solved on these networks. This and other problems on these graphical models are often known to be hard to solve in general, but tractable on graphs with bounded Treewidth. Therefore, finding or approximating the Treewidth of a graph is a fundamental problem related to inference in graphical models. In this paper, we study the approximability of a number of graph problems: Treewidth and Pathwidth of graphs, Minimum Fill-In, and a variety of different graph layout problems such as Minimum Cut Linear Arrangement. We show that, assuming the Small Set Expansion Conjecture, all of these problems are NP-hard to approximate to within any constant factor in polynomial time.
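
For readers unfamiliar with the quantities involved, the sketch below computes an upper bound on Treewidth, and the fill-in of the same ordering, using the well-known greedy minimum-degree elimination heuristic. It is an illustration only: the heuristic carries no approximation guarantee, it is not a method from the paper, and the function name and input format are assumptions.

```python
import itertools

def treewidth_upper_bound(adjacency):
    """Greedy minimum-degree elimination on an undirected graph.

    adjacency: dict mapping each vertex to an iterable of its neighbours.
    Returns (width, fill_in): the largest neighbourhood met at elimination time,
    which upper-bounds the treewidth, and the number of fill edges added.
    """
    # Build a symmetric copy without self-loops.
    graph = {v: set() for v in adjacency}
    for v, nbrs in adjacency.items():
        for u in nbrs:
            if u != v:
                graph[v].add(u)
                graph.setdefault(u, set()).add(v)

    width, fill_in = 0, 0
    while graph:
        v = min(graph, key=lambda u: len(graph[u]))  # vertex of minimum degree
        nbrs = graph.pop(v)
        width = max(width, len(nbrs))
        for u in nbrs:
            graph[u].discard(v)
        # Turn the neighbours of v into a clique, counting each new edge once.
        for a, b in itertools.combinations(nbrs, 2):
            if b not in graph[a]:
                graph[a].add(b)
                graph[b].add(a)
                fill_in += 1
    return width, fill_in

path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
complete4 = {i: [j for j in range(4) if j != i] for i in range(4)}

print(treewidth_upper_bound(path4))
print(treewidth_upper_bound(cycle5))
print(treewidth_upper_bound(complete4))
```

On these three small graphs the heuristic happens to return the exact Treewidth; on general graphs it can be far from optimal, which is consistent with the hardness result stated in the abstract.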

algorithm, graph, treewidth, (15 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Afghanistan > Parwan Province > Charikar (0.05)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)