AITopics

Luck, Kevin Sebastian, Vecerik, Mel, Stepputtis, Simon, Amor, Heni Ben, Scholz, Jonathan

Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient

Abstract: Model-free reinforcement learning algorithms such as Deep Deterministic Policy Gradient (DDPG) often require additional exploration strategies, especially if the actor is deterministic. This work evaluates model-based trajectory optimization methods for exploration in DDPG when trained on a latent image embedding. In addition, an extension of DDPG is derived using a value function as critic, making use of a learned deep dynamics model to compute the policy gradient. This approach leads to a symbiotic relationship between the deep reinforcement learning algorithm and the latent trajectory optimizer. The trajectory optimizer benefits from the critic learned by the RL algorithm, and the latter benefits from the enhanced exploration generated by the planner. The developed methods are evaluated on two continuous control tasks, one in simulation and one in the real world. In particular, a Baxter robot is trained to perform an insertion task while receiving only sparse rewards and images as observations from the environment. Introduction: Reinforcement learning (RL) methods have enabled the development of autonomous systems that can learn and master a task when provided with an objective function. RL has been successfully applied to a wide range of tasks including flying [24], [17], manipulation [26], [9], [12], [3], [1], locomotion [10], [13], and even autonomous driving [6], [7].

artificial intelligence, exploration, upstream oil & gas, (16 more...)

1911.06833

Country:

North America > United States > Arizona (0.14)
Europe > Switzerland (0.14)

Genre: Research Report (0.82)

Industry:

Energy > Oil & Gas > Upstream (0.36)
Transportation (0.34)
Information Technology (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Enforcing Deterministic Constraints on Generative Adversarial Networks for Emulating Physical Systems

Yang, Zeng, Wu, Jin-Long, Xiao, Heng

Generative adversarial networks (GANs) were initially proposed to generate images by learning from a large number of samples. Recently, GANs have been used to emulate complex physical systems such as turbulent flows. However, a critical question must be answered before GANs can be considered trusted emulators for physical systems: do GAN-generated samples conform to the various physical constraints? These include both deterministic constraints (e.g., conservation laws) and statistical constraints (e.g., energy spectrum in turbulent flows). The latter have been studied in a companion paper (Wu et al. 2019). In the present work, we enforce deterministic yet approximate constraints on GANs by incorporating them into the loss function of the generator. We evaluate the performance of physics-constrained GANs on two representative tasks with geometrical constraints (generating points on circles) and differential constraints (generating divergence-free flow velocity fields), respectively. In both cases, the constrained GANs produced samples that precisely conform to the underlying constraints, even though the constraints are only enforced approximately. More importantly, the imposed constraints significantly accelerate the convergence and improve the robustness of the training. These improvements are noteworthy, as convergence and robustness are two well-known obstacles in the training of GANs. Keywords: Generative adversarial networks, physics constraints, physics-informed machine learning 1. Introduction Machine learning, and particularly deep learning, has achieved significant success in a wide range of commercial domain applications such as image recognition, audio recognition, and natural language processing [1-5].
Corresponding author email address: hengxiao@vt.edu
For example, machine learning methods such as random forests and neural networks have been used to provide closure models for turbulent flows [6-9] and multiphase flows [10, 11] and to compute rock permeability directly from CT scan images [12]. They have also been used to discover ordinary and partial differential equations (ODEs and PDEs) from data [13-16]. Finally, neural networks have been used to solve exactly specified PDEs [17-20] and partially known PDEs by incorporating available data [21-24]. The scientific applications reviewed above mostly involve supervised learning, which consists of three steps: (a) postulate a model that maps inputs (features) to outputs (labels), controlled by a set of adjustable model parameters; (b) learn the parameters from training data (labeled examples of input-output pairs); and (c) use the fitted model to predict the responses for new inputs that were not included in the training data.
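The idea of adding a constraint to the generator loss can be illustrated with the circle task mentioned in the abstract. The following is only a minimal sketch, not the authors' implementation: the penalty form (mean squared violation of x^2 + y^2 = r^2), the function names, and the `weight` parameter are all assumptions made for illustration.

```python
import math


def circle_penalty(points, radius=1.0):
    """Mean squared violation of x^2 + y^2 = radius^2 over a batch of 2-D points."""
    violations = [(x * x + y * y - radius * radius) ** 2 for x, y in points]
    return sum(violations) / len(violations)


def constrained_generator_loss(adversarial_loss, points, weight=1.0, radius=1.0):
    """Adversarial loss plus a weighted soft-constraint penalty (assumed form)."""
    return adversarial_loss + weight * circle_penalty(points, radius)


# Points that lie on the unit circle incur (numerically) no penalty.
on_circle = [(math.cos(t), math.sin(t)) for t in (0.0, 1.0, 2.0)]
# A point at distance 2 from the origin violates the constraint.
off_circle = [(2.0, 0.0)]

print(circle_penalty(on_circle))
print(constrained_generator_loss(0.5, off_circle, weight=0.1))
```

Because the penalty is added to the loss rather than imposed as a hard projection, the constraint is enforced only approximately, which matches the "deterministic yet approximate" wording of the abstract.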

constraint, deep learning, upstream oil & gas, (20 more...)

1911.06671

Country:

North America > United States > Virginia (0.14)
Asia > China (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hinne, Max, van Gerven, Marcel A. J., Ambrogioni, Luca

Causal inference using Bayesian non-parametric quasi-experimental design

The de facto standard for causal inference is the randomized controlled trial, where one compares a manipulated group with a control group in order to determine the effect of an intervention. However, this research design is not always realistically possible due to pragmatic or ethical concerns. In these situations, quasi-experimental designs may provide a solution, as these allow for causal conclusions at the cost of additional design assumptions. In this paper, we provide a generic framework for quasi-experimental design using Bayesian model comparison, and we show how it can be used as an alternative to several common research designs. We provide a theoretical motivation for a Gaussian process based approach and demonstrate its convenient use in a number of simulations. Finally, we apply the framework to determine the effect of population-based thresholds for municipality funding in France, of the 2005 smoking ban in Sicily on the number of acute coronary events, and of an alleged historical phantom border in the Netherlands on Dutch voting behaviour.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1911.06722

Country:

Europe > Netherlands (0.34)
Europe > Italy > Sicily (0.24)
Europe > France (0.24)
(7 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Mita, Graziano, Papotti, Paolo, Filippone, Maurizio, Michiardi, Pietro

LIBRE: Learning Interpretable Boolean Rule Ensembles

We present a novel method, LIBRE, to learn an interpretable classifier, which materializes as a set of Boolean rules. LIBRE uses an ensemble of bottom-up weak learners operating on a random subset of features, which allows for the learning of rules that generalize well on unseen data even in imbalanced settings. Weak learners are combined with a simple union so that the final ensemble is also interpretable. Experimental results indicate that LIBRE efficiently strikes the right balance between prediction accuracy, which is competitive with black box methods, and interpretability, which is often superior to alternative methods from the literature.
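The union step described in the abstract can be sketched as follows. This is a hypothetical encoding, not LIBRE's actual data structures: each rule is assumed to be a conjunction of feature-value conditions stored as a dictionary, each weak learner contributes a list of such rules, and the feature names are invented for the example.

```python
def rule_fires(rule, sample):
    """A rule is a conjunction: every (feature, value) condition must hold."""
    return all(sample.get(feature) == value for feature, value in rule.items())


def predict_union(rule_sets, sample):
    """Union of weak learners: predict positive if any rule from any learner fires."""
    return any(rule_fires(rule, sample) for rules in rule_sets for rule in rules)


# Two hypothetical weak learners, each trained on a different feature subset.
rule_sets = [
    [{"age": "old", "smoker": True}],  # learner 1
    [{"bp": "high"}],                  # learner 2
]

print(predict_union(rule_sets, {"age": "old", "smoker": True, "bp": "low"}))
print(predict_union(rule_sets, {"age": "young", "smoker": True, "bp": "low"}))
```

Since the combined model is still just a list of readable conjunctions, the ensemble stays as inspectable as its individual rules, which is the interpretability argument the abstract makes for using a simple union.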

artificial intelligence, dataset, machine learning, (16 more...)

1911.06537

Country:

North America > United States > Wisconsin (0.04)
Europe > France (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Lőrincz, Szabolcs-Botond, Pável, Szabolcs, Csató, Lehel

Single View Distortion Correction using Semantic Guidance

Most distortion correction methods focus on simple forms of distortion, such as radial or linear distortions. These works undistort images either based on measurements in the presence of a calibration grid, or use multiple views to find point correspondences and predict distortion parameters. When possible distortions are more complex, e.g. in the case of a camera being placed behind a refractive surface such as glass, the standard method is to use a calibration grid. Considering a high variety of distortions, it is nonviable to conduct these measurements. In this work, we present a single view distortion correction method which is capable of undistorting images containing arbitrarily complex distortions by exploiting recent advancements in differentiable image sampling and in the usage of semantic information to augment various tasks. The results of this work show that our model is able to estimate and correct highly complex distortions, and that incorporating semantic information mitigates the process of image undistortion.

artificial intelligence, distortion, machine learning, (14 more...)

doi: 10.1109/IJCNN.2019.8852065

1911.06505

Country: Europe > Romania > Nord-Vest Development Region > Cluj County > Cluj-Napoca (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Kamoi, Ryo, Kobayashi, Kei

Likelihood Assignment for Out-of-Distribution Inputs in Deep Generative Models is Sensitive to Prior Distribution Choice

Recent work has shown that deep generative models assign higher likelihood to out-of-distribution inputs than to training data. We show that a factor underlying this phenomenon is a mismatch between the nature of the prior distribution and that of the data distribution, a problem found in widely used deep generative models such as VAEs and Glow. While a typical choice for a prior distribution is a standard Gaussian distribution, properties of distributions of real data sets may not be consistent with a unimodal prior distribution. This paper focuses on the relationship between the choice of a prior distribution and the likelihoods assigned to out-of-distribution inputs. We propose the use of a mixture distribution as a prior to make likelihoods assigned by deep generative models sensitive to out-of-distribution inputs. Furthermore, we explain the theoretical advantages of adopting a mixture distribution as the prior, and we present experimental results to support our claims. Finally, we demonstrate that a mixture prior lowers the out-of-distribution likelihood with respect to two pairs of real image data sets: Fashion-MNIST vs. MNIST and CIFAR10 vs. SVHN.
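To make the contrast between a standard Gaussian prior and a mixture prior concrete, the sketch below evaluates both log-densities for a one-dimensional latent value. It is only an illustration under stated assumptions: the one-dimensional setting, the component weights, means, and standard deviations, and the function names are chosen for the example and are not taken from the paper.

```python
import math


def log_normal(z, mean=0.0, std=1.0):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((z - mean) / std) ** 2


def log_mixture(z, weights, means, stds):
    """Log-density of a 1-D Gaussian mixture, computed with the log-sum-exp trick."""
    terms = [math.log(w) + log_normal(z, m, s) for w, m, s in zip(weights, means, stds)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# A latent value midway between two well-separated components (assumed parameters)
# is far less likely under the mixture than under a single standard Gaussian.
z = 0.0
print(log_normal(z))
print(log_mixture(z, weights=[0.5, 0.5], means=[-3.0, 3.0], stds=[1.0, 1.0]))
```

With a single component of weight 1, `log_mixture` reduces to `log_normal`, so the standard Gaussian prior is the one-component special case of this sketch.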

artificial intelligence, likelihood, machine learning, (18 more...)

1911.06515

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Journal of Artificial Intelligence Research, Nov-15-2019

Embedding Projection for Targeted Cross-lingual Sentiment: Model Comparisons and a Real-World Study

Barnes, Jeremy (University of Oslo) | Klinger, Roman

Sentiment analysis benefits from large, hand-annotated resources in order to train and test machine learning models, which are often data hungry. While some languages, e.g., English, have a vast array of these resources, most under-resourced languages do not, especially for fine-grained sentiment tasks, such as aspect-level or targeted sentiment analysis. To improve this situation, we propose a cross-lingual approach to sentiment analysis that is applicable to under-resourced languages and takes into account target-level information. This model incorporates sentiment information into bilingual distributional representations by jointly optimizing them for semantics and sentiment, showing state-of-the-art performance at the sentence level when combined with machine translation. The adaptation to targeted sentiment analysis on multiple domains shows that our model outperforms other projection-based bilingual embedding methods on binary targeted sentiment tasks. Our analysis on ten languages demonstrates that the amount of unlabeled monolingual data has surprisingly little effect on the sentiment results. As expected, the choice of an annotated source language for projection to a target leads to better results for source-target language pairs which are similar. Therefore, our results suggest that more effort should be spent on the creation of resources for languages less similar to those which are already resource-rich. Finally, a domain mismatch leads to decreased performance. This suggests resources in any language should ideally cover a variety of domains.

computational linguistic, proceedings, sentiment analysis, (14 more...)

doi: 10.1613/jair.1.11561

AI Access Foundation

11561

Country:

Europe > United Kingdom > England > Greater London > London (0.14)
Europe > Norway > Eastern Norway > Oslo (0.05)
Europe > Norway > Eastern Norway > Akershus (0.04)
(34 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Services (0.68)
Leisure & Entertainment (0.67)
Government > Voting & Elections (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Generating Persona Consistent Dialogues by Exploiting Natural Language Inference

Song, Haoyu, Zhang, Wei-Nan, Hu, Jingwen, Liu, Ting

Consistency is one of the major challenges faced by dialogue agents. A human-like dialogue agent should not only respond naturally, but also maintain a consistent persona. In this paper, we exploit the advantages of natural language inference (NLI) techniques to address the issue of generating persona-consistent dialogues. Different from existing work that re-ranks the retrieved responses through an NLI model, we cast the task as a reinforcement learning problem and propose to exploit the NLI signals from response-persona pairs as rewards for the process of dialogue generation. Specifically, our generator employs an attention-based encoder-decoder to generate persona-based responses. Our evaluator consists of two components: an adversarially trained naturalness module and an NLI-based consistency module. Moreover, we use another well-performing NLI model in the evaluation of persona consistency. Experimental results on both human and automatic metrics, including the model-based consistency evaluation, demonstrate that the proposed approach outperforms strong generative baselines, especially in the persona consistency of generated responses.

dialogue generation, generator, proceedings, (14 more...)

1911.05889

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Automated Augmentation with Reinforcement Learning and GANs for Robust Identification of Traffic Signs using Front Camera Images

Chowdhury, Sohini Roy, Tornberg, Lars, Halvfordsson, Robin, Nordh, Jonatan, Gustafsson, Adam Suhren, Wall, Joel, Westerberg, Mattias, Wirehed, Adam, Tilloy, Louis, Hu, Zhanying, Tan, Haoyuan, Pan, Meng, Sjoberg, Jonas

Traffic sign identification using camera images from vehicles plays a critical role in autonomous driving and path planning. However, the front camera images can be distorted due to blurriness, lighting variations and vandalism, which can lead to degradation of detection performance. As a solution, machine learning models must be trained with data from multiple domains, and collecting and labeling more data in each new domain is time-consuming and expensive. In this work, we present an end-to-end framework to augment traffic sign training data using optimal reinforcement learning policies and a variety of Generative Adversarial Network (GAN) models, which can then be used to train traffic sign detector modules. Our automated augmenter enables learning from transformed nighttime, poor lighting, and varying degrees of occlusions using the LISA Traffic Sign and BDD-Nexar datasets. The proposed method enables mapping training data from one domain to another, thereby improving traffic sign detection precision/recall from 0.70/0.66 to 0.83/0.71 for nighttime images.

augmentation, time image, traffic sign, (16 more...)

1911.06486

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)

Genre: Research Report (0.52)

Industry:

Automobiles & Trucks (0.48)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.34)
Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)