AITopics | Undirected Networks


Entity Abstraction in Visual Model-Based Reinforcement Learning

Veerapaneni, Rishi, Co-Reyes, John D., Chang, Michael, Janner, Michael, Finn, Chelsea, Wu, Jiajun, Tenenbaum, Joshua B., Levine, Sergey

arXiv.org Machine Learning, Oct-29-2019

This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before. We present object-centric perception, prediction, and planning (OP3), which to the best of our knowledge is the first entity-centric dynamic latent variable framework for model-based reinforcement learning that acquires entity representations from raw visual observations without supervision and uses them to predict and plan. OP3 enforces entity-abstraction -- symmetric processing of each entity representation with the same locally-scoped function -- which enables it to scale to numbers and configurations of objects that differ from those seen in training. Our approach to the key technical challenge of grounding these entity representations to actual objects in the environment is to frame this variable binding problem as an inference problem, and we develop an interactive inference algorithm that uses temporal continuity and interactive feedback to bind information about object properties to the entity variables. On block-stacking tasks, OP3 generalizes to novel block configurations and more objects than observed during training, outperforming an oracle model that assumes access to object supervision and achieving two to three times better accuracy than a state-of-the-art video prediction model.

arxiv, entity abstraction, representation, (15 more...)

arXiv:1910.12827

Country:

North America > United States > New York > Erie County > Buffalo (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
(2 more...)


On Connections between Constrained Optimization and Reinforcement Learning

Vieillard, Nino, Pietquin, Olivier, Geist, Matthieu

arXiv.org Machine Learning, Oct-29-2019

Dynamic Programming (DP) provides standard algorithms to solve Markov Decision Processes. However, these algorithms generally do not optimize a scalar objective function. In this paper, we draw connections between DP and (constrained) convex optimization. Specifically, we show clear links in the algorithmic structure between three DP schemes and optimization algorithms. We link Conservative Policy Iteration to Frank-Wolfe, Mirror-Descent Modified Policy Iteration to Mirror Descent, and Politex (Policy Iteration Using Expert Prediction) to Dual Averaging. These abstract DP schemes are representative of a number of (deep) Reinforcement Learning (RL) algorithms. By highlighting these connections (most of which have been noticed earlier, but in a scattered way), we would like to encourage further studies linking RL and convex optimization, that could lead to the design of new, more efficient, and better understood RL algorithms.
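As a rough illustration of the first of these links, the sketch below places a Conservative Policy Iteration style update next to a Frank-Wolfe step on the probability simplex. Both move the current iterate toward an extreme point (a greedy policy, or a simplex vertex) with step size `alpha`. The function names, the toy Q-values and the linear objective are hypothetical and chosen only for illustration; this is not the paper's code or its formal statement.

```python
def greedy_policy(q):
    """Deterministic greedy policy: one-hot on the best action in each state."""
    policy = []
    for q_s in q:
        best = max(range(len(q_s)), key=lambda a: q_s[a])
        policy.append([1.0 if a == best else 0.0 for a in range(len(q_s))])
    return policy


def cpi_step(pi, q, alpha):
    """CPI-style update: pi_new = (1 - alpha) * pi + alpha * greedy(q)."""
    g = greedy_policy(q)
    return [[(1 - alpha) * p + alpha * gp for p, gp in zip(pi_s, g_s)]
            for pi_s, g_s in zip(pi, g)]


def frank_wolfe_step(x, grad, alpha):
    """Frank-Wolfe step on the probability simplex (minimisation).

    The linear minimisation oracle over the simplex returns the vertex
    with the smallest gradient coordinate.
    """
    best = min(range(len(grad)), key=lambda i: grad[i])
    vertex = [1.0 if i == best else 0.0 for i in range(len(x))]
    return [(1 - alpha) * xi + alpha * vi for xi, vi in zip(x, vertex)]


# Toy inputs (hypothetical): 2 states, 3 actions.
pi0 = [[1 / 3, 1 / 3, 1 / 3], [0.5, 0.25, 0.25]]
q0 = [[0.0, 1.0, 0.5], [2.0, 0.0, 1.0]]
pi1 = cpi_step(pi0, q0, alpha=0.5)

# The state-0 update seen as a Frank-Wolfe step that minimises -<x, q0[0]>,
# whose gradient is simply -q0[0].
x1 = frank_wolfe_step(pi0[0], [-v for v in q0[0]], alpha=0.5)
```

In this toy case the per-state CPI update and the Frank-Wolfe step produce the same vector, which is the structural similarity the abstract refers to.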

algorithm, gradient, iteration, (13 more...)

arXiv:1910.08476

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)


Certified Adversarial Robustness for Deep Reinforcement Learning

Lütjens, Björn, Everett, Michael, How, Jonathan P.

arXiv.org Artificial Intelligence, Oct-28-2019

Deep Neural Network-based systems are now the state of the art in many robotics tasks, but their application in safety-critical domains remains dangerous without formal guarantees on network robustness. Small perturbations to sensor inputs (from noise or adversarial examples) are often enough to change network-based decisions, which has already been shown to cause an autonomous vehicle to swerve into oncoming traffic. In light of these dangers, numerous algorithms have been developed as defensive mechanisms against these adversarial inputs, some of which provide formal robustness guarantees or certificates. This work leverages research on certified adversarial robustness to develop an online certified defense for deep reinforcement learning algorithms. The proposed defense computes guaranteed lower bounds on state-action values during execution to identify and choose the optimal action under a worst-case deviation in input space due to possible adversaries or noise. The approach is demonstrated on a Deep Q-Network policy and is shown to increase robustness to noise and adversaries in pedestrian collision avoidance scenarios and a classic control task.
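The action-selection idea can be sketched in a few lines, under a strong simplifying assumption: the Q-function here is linear in the state, so the worst-case value under an L-infinity perturbation of radius `eps` has a closed form. The paper works with a Deep Q-Network and certified neural-network bounds; the linear model, the function names and the numbers below are hypothetical stand-ins used only to show the "maximise the lower bound" rule.

```python
def q_lower_bounds(weights, biases, state, eps):
    """Worst-case Q-value per action over all s' with ||s' - state||_inf <= eps.

    For a linear Q(s, a) = w_a . s + b_a the minimum is exact:
    w_a . state + b_a - eps * ||w_a||_1.
    """
    bounds = []
    for w_a, b_a in zip(weights, biases):
        nominal = sum(w * s for w, s in zip(w_a, state)) + b_a
        bounds.append(nominal - eps * sum(abs(w) for w in w_a))
    return bounds


def robust_action(weights, biases, state, eps):
    """Pick the action whose guaranteed lower bound is largest."""
    bounds = q_lower_bounds(weights, biases, state, eps)
    return max(range(len(bounds)), key=lambda a: bounds[a])


# Hypothetical toy Q-function: action 0 has the higher nominal value but is
# very sensitive to the input; action 1 is slightly worse but much flatter.
W = [[5.0, -5.0], [0.5, 0.5]]
b = [0.0, 0.0]
s = [1.0, 0.8]

nominal_choice = robust_action(W, b, s, eps=0.0)  # plain argmax of Q
robust_choice = robust_action(W, b, s, eps=0.1)   # argmax of the lower bound
```

With no perturbation budget the rule reduces to the usual greedy choice; with a nonzero budget it prefers the action whose value degrades least in the worst case.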

agent, perturbation, robustness, (16 more...)

arXiv:1910.12908

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(6 more...)

Genre: Research Report (0.40)

Industry:

Transportation (0.69)
Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)


Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck

Igl, Maximilian, Ciosek, Kamil, Li, Yingzhen, Tschiatschek, Sebastian, Zhang, Cheng, Devlin, Sam, Hofmann, Katja

arXiv.org Machine Learning, Oct-28-2019

The ability of policies to generalize to new environments is key to the broad application of RL agents. A promising approach to prevent an agent's policy from overfitting to a limited set of training environments is to apply regularization techniques originally developed for supervised learning. However, there are stark differences between supervised learning and RL. We discuss those differences and propose modifications to existing regularization techniques in order to better adapt them to RL. In particular, we focus on regularization techniques relying on the injection of noise into the learned function, a family that includes some of the most widely used approaches such as Dropout and Batch Normalization. To adapt them to RL, we propose Selective Noise Injection (SNI), which maintains the regularizing effect of the injected noise while mitigating its adverse effects on the gradient quality. Furthermore, we demonstrate that the Information Bottleneck (IB) is a particularly well-suited regularization technique for RL as it is effective in the low-data regime encountered early on in training RL agents. Combining the IB with SNI, we significantly outperform current state-of-the-art results, including on the recently proposed generalization benchmark Coinrun.

international conference, learning, regularization technique, (12 more...)

arXiv:1910.12911

Country:

Europe > France (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(6 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)


Large-Scale Characterization and Segmentation of Internet Path Delays with Infinite HMMs

Mouchet, Maxime, Vaton, Sandrine, Chonavel, Thierry, Aben, Emile, Hertog, Jasper den

arXiv.org Machine Learning, Oct-28-2019

Round-Trip Times are one of the most commonly collected performance metrics in computer networks. Measurement platforms such as RIPE Atlas provide researchers and network operators with an unprecedented amount of historical Internet delay measurements. It would be very useful to automate the processing of these measurements (statistical characterization of path performance, change detection, recognition of recurring patterns, etc.). Humans are good at finding patterns in network measurements, but it can be difficult to automate this so that many time series can be processed at the same time. In this article we introduce a new model, the HDP-HMM or infinite hidden Markov model, whose performance in trace segmentation is very close to human cognition. This comes at the cost of greater complexity, and the ambition of this article is to make the theory accessible to network monitoring and management researchers. We demonstrate that this model provides very accurate results on a labeled dataset and on RIPE Atlas and CAIDA MANIC data. This method has been implemented in Atlas and we introduce the publicly accessible Web API.
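To make the idea of HMM-based trace segmentation concrete, here is a minimal sketch that labels a delay series with the most likely hidden state at each step and reports where the state changes. It deliberately swaps in a much simpler technique than the article's: a finite HMM with a fixed number of Gaussian states and hand-set parameters, decoded with the Viterbi algorithm, whereas the HDP-HMM infers the number of states and their parameters from data. The function names, the sticky transition probability and the RTT values are all hypothetical.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def viterbi_segment(series, means, stds, stay_prob=0.95):
    """Most likely state path for a Gaussian-emission HMM with sticky transitions.

    Assumes at least two states; leaving a state is equally likely for every
    other state.
    """
    k = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1 - stay_prob) / (k - 1))
    # Uniform initial distribution over states.
    score = [-math.log(k) + gaussian_logpdf(series[0], means[j], stds[j])
             for j in range(k)]
    back = []
    for x in series[1:]:
        new_score, pointers = [], []
        for j in range(k):
            cands = [score[i] + (log_stay if i == j else log_move) for i in range(k)]
            best_i = max(range(k), key=lambda i: cands[i])
            pointers.append(best_i)
            new_score.append(cands[best_i] + gaussian_logpdf(x, means[j], stds[j]))
        score = new_score
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(k), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


def change_points(path):
    """Indices at which the decoded state differs from the previous one."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


# Hypothetical RTT trace in milliseconds: a baseline near 20 ms with one
# outlier, followed by a lasting shift to about 35 ms.
rtt = [20.1, 19.8, 20.3, 20.0, 27.0, 19.9, 35.2, 34.8, 35.1, 35.0, 34.9]
path = viterbi_segment(rtt, means=[20.0, 35.0], stds=[2.0, 2.0])
segments = change_points(path)
```

The sticky transition prior is what keeps the single 27 ms outlier from being reported as a separate segment, while the lasting shift at index 6 is detected.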

hdp-hmm, mixture model, time series, (9 more...)

arXiv:1910.12714

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry:

Telecommunications > Networks (1.00)
Information Technology > Networks (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


11 Alternatives To Keras For Deep Learning Enthusiasts

#artificialintelligence, Oct-25-2019, 23:17:27 GMT

Infer.NET is a machine learning framework for running Bayesian inference in graphical models. It provides state-of-the-art message-passing algorithms and statistical routines needed to perform inference for a wide variety of applications. The framework offers a rich modelling language and multiple inference algorithms; it is designed for large-scale inference and is user-extensible. With the help of this framework, various Bayesian models such as Bayes Point Machine classifiers, TrueSkill matchmaking, hidden Markov models, and Bayesian networks can be implemented with ease.

deep learning enthusiast, inference, keras


Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.75)


Online Gaussian LDA for Unsupervised Pattern Mining from Utility Usage Data

Mohamad, Saad, Bouchachia, Abdelhamid

arXiv.org Machine Learning, Oct-25-2019

Non-intrusive load monitoring (NILM) aims at separating a whole-home energy signal into its appliance components. Such a method can be harnessed to provide various services to better manage and control energy consumption (optimal planning and saving). NILM has traditionally been approached from signal processing and electrical engineering perspectives. Recently, machine learning has started to play an important role in NILM. While most work has focused on supervised algorithms, unsupervised approaches can be more interesting and of practical use in real-world scenarios. Specifically, they do not require labelled training data to be acquired from individual appliances, and the algorithm can be deployed to operate on the measured aggregate data directly. In this paper, we propose a fully unsupervised NILM framework based on Bayesian hierarchical mixture models. In particular, we develop a new method based on Gaussian Latent Dirichlet Allocation (GLDA) in order to extract global components that summarise the energy signal. These components provide a representation of the consumption patterns. Designed to cope with big data, our algorithm, unlike existing NILM ones, does not focus on appliance recognition. To handle this massive data, GLDA works online. Another novelty of this work compared to existing NILM work is that the data involves different utilities (e.g., electricity, water and gas) as well as some sensor measurements. Finally, we propose different evaluation methods to analyse the results, which show that our algorithm finds useful patterns.

algorithm, consumption, energy consumption, (13 more...)

arXiv:1910.11599

Country:

North America > Canada (0.04)
Europe > United Kingdom > England > Dorset > Poole (0.04)
Europe > United Kingdom > England > Dorset > Bournemouth (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
(2 more...)


On the convergence of projective-simulation-based reinforcement learning in Markov decision processes

Clausen, Jens, Boyajian, Walter L., Trenkwalder, Lea M., Dunjko, Vedran, Briegel, Hans J.

arXiv.org Artificial Intelligence, Oct-25-2019

In recent years, interest in leveraging quantum effects to enhance machine learning tasks has increased significantly. Many algorithms that speed up supervised and unsupervised learning have been established. Projective simulation was the first framework in which ways to exploit quantum resources were found specifically for the broader context of reinforcement learning. Projective simulation presents an agent-based reinforcement learning approach designed in a manner which may support quantum walk-based speed-ups. Although classical variants of projective simulation have been benchmarked against common reinforcement learning algorithms, very few formal theoretical analyses have been provided for its performance in standard learning scenarios. In this paper, we provide a detailed formal discussion of the properties of this model. Specifically, we prove that one version of the projective simulation model, understood as a reinforcement learning approach, converges to optimal behavior in a large class of Markov decision processes. This proof shows that a physically-inspired approach to reinforcement learning can be guaranteed to converge.

agent, optimal policy, probability, (15 more...)

arXiv:1910.11914

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)


On the geometry of learning neural quantum states

Park, Chae-Yeun, Kastoryano, Michael J.

arXiv.org Machine Learning, Oct-24-2019

Combining insights from machine learning and quantum Monte Carlo, the stochastic reconfiguration method with neural network Ansatz states is a promising new direction for high-precision ground state estimation of quantum many-body problems. At present, the method is heuristic, lacking a proper theoretical foundation. We initiate a thorough analysis of the learning landscape, and show that it reveals universal behavior reflecting a combination of the underlying physics and of the learning dynamics. In particular, the spectrum of the quantum Fisher matrix of complex restricted Boltzmann machine states can dramatically change across a phase transition. In contrast to the spectral properties of the quantum Fisher matrix, the actual weights of the network at convergence do not reveal much information about the system or the dynamics. Furthermore, we identify a new measure of correlation in the state by analyzing the entanglement of the eigenvectors. We show that, generically, the learning landscape modes with the least entanglement have the largest eigenvalues, suggesting that correlations are encoded in large flat valleys of the learning landscape, favoring stable representations of the ground state.

matrix, quantum fisher matrix, spectrum, (13 more...)

arXiv:1910.11163

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Cologne (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)


Task-Motion Planning for Navigation in Belief Space

Thomas, Antony, Mastrogiovanni, Fulvio, Baglietto, Marco

arXiv.org Artificial Intelligence, Oct-24-2019

We present an integrated Task-Motion Planning (TMP) framework for navigation in large-scale environments. Autonomous robots operating in complex real-world scenarios require planning in the discrete (task) space and the continuous (motion) space. In knowledge-intensive domains, on the one hand, a robot has to reason at the highest level, for example about the regions to navigate to; on the other hand, the feasibility of the respective navigation tasks has to be checked at the execution level. This presents a need for motion-planning-aware task planners. We discuss a probabilistically complete approach that leverages this task-motion interaction for navigating in indoor domains, returning a plan that is optimal at the task level. Furthermore, our framework is intended for motion planning under motion and sensing uncertainty, which is formally known as belief space planning. The underlying methodology is validated with a simulated office environment in Gazebo. In addition, we discuss the limitations and provide suggestions for improvements and future work.

1 Introduction

Autonomous robots operating in complex real-world scenarios require different levels of planning to execute their tasks. High-level (task) planning helps break down a given set of tasks into a sequence of sub-tasks. Actual execution of each of these sub-tasks would require low-level control actions to generate appropriate robot motions. In fact, the dependency between logical and geometrical aspects is pervasive in both task planning and execution.

motion planner, task planner, task-motion planning, (13 more...)

arXiv:1910.11683

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
