AITopics

2005.03718

Country:

North America > United States > Illinois > Cook County > Lemont (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)

Genre: Research Report (0.82)

Industry: Energy > Renewable > Solar (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)

Dann, Christoph, Mansour, Yishay, Mohri, Mehryar, Sekhari, Ayush, Sridharan, Karthik

Reinforcement Learning with Feedback Graphs

arXiv.org Artificial IntelligenceMay-7-2020

We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations. Such additional observations are available in a range of tasks through extended sensors or prior knowledge about the environment (e.g., when certain actions yield similar outcome). We formalize this setting using a feedback graph over state-action pairs and show that model-based algorithms can leverage the additional feedback for more sample-efficient learning. We give a regret bound that, ignoring logarithmic factors and lower-order terms, depends only on the size of the maximum acyclic subgraph of the feedback graph, in contrast with a polynomial dependency on the number of states and actions in the absence of a feedback graph. Finally, we highlight challenges when leveraging a small dominating set of the feedback graph as compared to the bandit setting and propose a new algorithm that can use knowledge of such a dominating set for more sample-efficient learning of a near-optimal policy.

feedback graph, machine learning, reinforcement learning, (18 more...)

2005.03789

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Zheng, Kaiyu, Sung, Yoonchang, Konidaris, George, Tellex, Stefanie

Multi-Resolution POMDP Planning for Multi-Object Search in 3D

arXiv.org Artificial IntelligenceMay-6-2020

Robots operating in household environments must find objects on shelves, under tables, and in cupboards. Previous work often formulate the object search problem as a POMDP (Partially Observable Markov Decision Process), yet constrain the search space in 2D. We propose a new approach that enables the robot to efficiently search for objects in 3D, taking occlusions into account. We model the problem as an object-oriented POMDP, where the robot receives a volumetric observation from a viewing frustum and must produce a policy to efficiently search for objects. To address the challenge of large state and observation spaces, we first propose a per-voxel observation model which drastically reduces the observation size necessary for planning. Then, we present a novel octree-based belief representation which captures beliefs at different resolutions and supports efficient exact belief update. Finally, we design an online multi-resolution planning algorithm that leverages the resolution layers in the octree structure as levels of abstractions to the original POMDP problem. Our evaluation in a simulated 3D domain shows that, as the problem scales, our approach significantly outperforms baselines without resolution hierarchy by 25%-35% in cumulative reward. We demonstrate the practicality of our approach on a torso-actuated mobile robot searching for objects in areas of a cluttered lab environment where objects appear on surfaces at different heights.

artificial intelligence, machine learning, robot, (19 more...)

2005.02878

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial IntelligenceMay-6-2020

Building A User-Centric and Content-Driven Socialbot

Fang, Hao

To build Sounding Board, we develop a system architecture that is capable of accommodating dialog strategies that we designed for socialbot conversations. The architecture consists of a multi-dimensional language understanding module for analyzing user utterances, a hierarchical dialog management framework for dialog context tracking and complex dialog control, and a language generation process that realizes the response plan and makes adjustments for speech synthesis. Additionally, we construct a new knowledge base to power the socialbot by collecting social chat content from a variety of sources. An important contribution of the system is the synergy between the knowledge base and the dialog management, i.e., the use of a graph structure to organize the knowledge base that makes dialog control very efficient in bringing related content to the discussion. Using the data collected from Sounding Board during the competition, we carry out in-depth analyses of socialbot conversations and user ratings which provide valuable insights in evaluation methods for socialbots. We additionally investigate a new approach for system evaluation and diagnosis that allows scoring individual dialog segments in the conversation. Finally, observing that socialbots suffer from the issue of shallow conversations about topics associated with unstructured data, we study the problem of enabling extended socialbot conversations grounded on a document. To bring together machine reading and dialog control techniques, a graph-based document representation is proposed, together with methods for automatically constructing the graph. Using the graph-based representation, dialog control can be carried out by retrieving nodes or moving along edges in the graph. To illustrate the usage, a mixed-initiative dialog strategy is designed for socialbot conversations on news articles.

dependency constituency dependency figure 5, information retrieval, machine learning, (22 more...)

2005.02623

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.13)
Asia > Taiwan (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Personal > Interview (1.00)
Overview (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Law (1.00)
(9 more...)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
(9 more...)

Journal of Artificial Intelligence ResearchMay-5-2020

Variational Bayes In Private Settings (VIPS)

Park, Mijung (Max Planck Institute for Intelligent Systems) | Foulds, James | Chaudhuri, Kamalika | Welling, Max

Many applications of Bayesian data analysis involve sensitive information such as personal documents or medical records, motivating methods which ensure that privacy is protected. We introduce a general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method. Our framework respects differential privacy, the gold-standard privacy criterion, and encompasses a large class of probabilistic models, called the Conjugate Exponential (CE) family. We observe that we can straightforwardly privatise VB's approximate posterior distributions for models in the CE family, by perturbing the expected sufficient statistics of the complete-data likelihood. For a broadly-used class of non-CE models, those with binomial likelihoods, we show how to bring such models into the CE family, such that inferences in the modified model resemble the private variational Bayes algorithm as closely as possible, using the Pólya-Gamma data augmentation scheme. The iterative nature of variational Bayes presents a further challenge since iterations increase the amount of noise needed. We overcome this by combining: (1) an improved composition method for differential privacy, called the moments accountant, which provides a tight bound on the privacy cost of multiple VB iterations and thus significantly decreases the amount of additive noise; and (2) the privacy amplification effect of subsampling mini-batches from large-scale data in stochastic learning. We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets.

data mining, machine learning, natural language, (18 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11763

AI Access Foundation

11763

Journal of Artificial Intelligence Research

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Maryland > Baltimore (0.14)
Asia > Middle East > Jordan (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Guerra, Anna, Guidi, Francesco, Dardari, Davide, Djuric, Petar M.

Reinforcement Learning for UAV Autonomous Navigation, Mapping and Target Detection

arXiv.org Machine LearningMay-5-2020

In this paper, we study a joint detection, mapping and navigation problem for a single unmanned aerial vehicle (UAV) equipped with a low complexity radar and flying in an unknown environment. The goal is to optimize its trajectory with the purpose of maximizing the mapping accuracy and, at the same time, to avoid areas where measurements might not be sufficiently informative from the perspective of a target detection. This problem is formulated as a Markov decision process (MDP) where the UAV is an agent that runs either a state estimator for target detection and for environment mapping, and a reinforcement learning (RL) algorithm to infer its own policy of navigation (i.e., the control law). Numerical results show the feasibility of the proposed idea, highlighting the UAV's capability of autonomously exploring areas with high probability of target detection while reconstructing the surrounding environment.

eirp 5, machine learning, reinforcement learning, (17 more...)

doi: 10.1109/PLANS46316.2020.9110163

2005.05057

Country:

Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.05)
North America > United States > New York > Suffolk County > Stony Brook (0.04)

Genre: Research Report (0.70)

Industry:

Information Technology > Robotics & Automation (0.34)
Aerospace & Defense > Aircraft (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Çakmak, Burak, Opper, Manfred

A Dynamical Mean-Field Theory for Learning in Restricted Boltzmann Machines

arXiv.org Machine LearningMay-4-2020

We define a message-passing algorithm for computing magnetization s in Restricted Boltzmann machines, which are Ising models on bipartite g raphs introduced as neural network models for probability distributions over spin con figurations. To model nontrivial statistical dependencies between the spins' couplings, we assume that the rectangular coupling matrix is drawn from an arbitrary bi-rotation in variant random matrix ensemble. Using the dynamical functional method of statist ical mechanics we exactly analyze the dynamics of the algorithm in the large system limit. We prove the global convergence of the algorithm under a stability criterion and c ompute asymptotic convergence rates showing excellent agreement with numerical sim ulations.

artificial intelligence, dynamical mean-field theory, machine learning, (18 more...)

2005.0156

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > Colorado (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

arXiv.org Machine LearningMay-4-2020

Connecting the Dots: Towards Continuous Time Hamiltonian Monte Carlo

Kleppe, Tore Selland

Continuous time Hamiltonian Monte Carlo is introduced, as a powerful alternative to Markov chain Monte Carlo methods for continuous target distributions. The method is constructed in two steps: First Hamiltonian dynamics are chosen as the deterministic dynamics in a continuous time piecewise deterministic Markov process. Under very mild restrictions, such a process will have the desired target distribution as an invariant distribution. Secondly, the numerical implementation of such processes, based on adaptive numerical integration of second order ordinary differential equations is considered. The numerical implementation yields an approximate, yet highly robust algorithm that, unlike conventional Hamiltonian Monte Carlo, enables the exploitation of the complete Hamiltonian trajectories (and hence the title). The proposed algorithm may yield large speedups and improvements in stability relative to relevant benchmarks, while incurring numerical errors that are negligible relative to the overall Monte Carlo errors.

artificial intelligence, machine learning, target distribution, (16 more...)

2005.01285

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Li, Zengyi, Sommer, Friedrich T.

Complex Amplitude-Phase Boltzmann Machines

arXiv.org Machine LearningMay-4-2020

We extend the framework of Boltzmann machines to a network of complex-valued neurons with variable amplitudes, referred to as Complex Amplitude-Phase Boltzmann machine (CAP-BM). The model is capable of performing unsupervised learning on the amplitude and relative phase distribution in complex data. The sampling rule of the Gibbs distribution and the learning rules of the model are presented. Learning in a Complex Amplitude-Phase restricted Boltzmann machine (CAP-RBM) is demonstrated on synthetic complex-valued images, and handwritten MNIST digits transformed by a complex wavelet transform. Specifically, we show the necessity of a new amplitude-amplitude coupling term in our model. The proposed model is potentially valuable for machine learning tasks involving complex-valued data with amplitude variation, and for developing algorithms for novel computation hardware, such as coupled oscillators and neuromorphic hardware, on which Boltzmann sampling can be executed in the complex domain.

artificial intelligence, boltzmann machine, machine learning, (14 more...)

2005.01862

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Colorado (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
Asia > Japan > Honshū > Kantō > Tochigi Prefecture > Utsunomiya (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)

arXiv.org Artificial IntelligenceMay-4-2020

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

Levine, Sergey, Kumar, Aviral, Tucker, George, Fu, Justin

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection. Offline reinforcement learning algorithms hold tremendous promise for making it possible to turn large datasets into powerful decision making engines. Effective offline reinforcement learning methods would be able to extract policies with the maximum possible utility out of the available data, thereby allowing automation of a wide range of decision-making domains, from healthcare and education to robotics. However, the limitations of current algorithms make this difficult. We will aim to provide the reader with an understanding of these challenges, particularly in the context of modern deep reinforcement learning methods, and describe some potential solutions that have been explored in recent work to mitigate these challenges, along with recent applications, and a discussion of perspectives on open problems in the field.

machine learning, reinforcement, reinforcement learning, (17 more...)

2005.01643

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation (1.00)
Health & Medicine > Therapeutic Area (1.00)
Information Technology (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)