AITopics

1905.11623

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Industry:

Information Technology (0.49)
Leisure & Entertainment > Games > Go (0.36)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Topin, Nicholay, Veloso, Manuela

Generation of Policy-Level Explanations for Reinforcement Learning

arXiv.org Artificial Intelligence, May-28-2019

Though reinforcement learning has greatly benefited from the incorporation of neural networks, the inability to verify the correctness of such systems limits their use. Current work in explainable deep learning focuses on explaining only a single decision in terms of input features, making it unsuitable for explaining a sequence of decisions. To address this need, we introduce Abstracted Policy Graphs, which are Markov chains of abstract states. This representation concisely summarizes a policy so that individual decisions can be explained in the context of expected future transitions. Additionally, we propose a method to generate these Abstracted Policy Graphs for deterministic policies given a learned value function and a set of observed transitions, potentially off-policy transitions used during training. Since no restrictions are placed on how the value function is generated, our method is compatible with many existing reinforcement learning methods. We prove that the worst-case time complexity of our method is quadratic in the number of features and linear in the number of provided transitions, $O(|F|^2 |tr\_samples|)$. By applying our method to a family of domains, we show that our method scales well in practice and produces Abstracted Policy Graphs which reliably capture relationships within these domains.
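
As a rough illustration of the representation described above (a Markov chain over abstract states), the following minimal sketch estimates transition probabilities between abstract states from observed transitions. It is not the authors' method: the `abstract_of` mapping, the function name, and the toy data are all hypothetical, and the way the paper actually groups states using the value function is not reproduced here.

```python
from collections import Counter, defaultdict


def build_abstract_chain(transitions, abstract_of):
    """Estimate a Markov chain over abstract states from (state, next_state) pairs.

    `abstract_of` is a hypothetical mapping from a concrete state to an
    abstract-state label; how that grouping is chosen is not shown here.
    """
    counts = defaultdict(Counter)
    for state, next_state in transitions:
        counts[abstract_of(state)][abstract_of(next_state)] += 1

    chain = {}
    for src, dests in counts.items():
        total = sum(dests.values())
        # Normalize counts into empirical transition probabilities.
        chain[src] = {dst: n / total for dst, n in dests.items()}
    return chain


# Toy example (assumed data): integer states grouped into "low" and "high".
toy_transitions = [(0, 1), (1, 2), (2, 3), (3, 3), (1, 0)]
toy_chain = build_abstract_chain(
    toy_transitions, lambda s: "low" if s < 2 else "high"
)
```

The loop makes a single pass over the provided transitions, which matches the linear dependence on the number of transitions mentioned in the abstract only for this counting step; the abstraction step itself is assumed given.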

artificial intelligence, machine learning, reinforcement learning, (17 more...)

1905.12044

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Robohub, May-27-2019, 21:46:44 GMT

Model-based reinforcement learning from pixels with structured latent variable models

Imagine a robot trying to learn how to stack blocks and push objects using visual inputs from a camera feed. In order to minimize cost and safety concerns, we want our robot to learn these skills with minimal interaction time, but efficient learning from complex sensory inputs such as images is difficult. This work introduces SOLAR, a new model-based reinforcement learning (RL) method that can learn skills – including manipulation tasks on a real Sawyer robot arm – directly from visual inputs with under an hour of interaction. To our knowledge, SOLAR is the most efficient RL method for solving real world image-based robotics tasks. Our robot learns to stack a Lego block and push a mug onto a coaster with only inputs from a camera pointed at the robot.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Robohub

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

#artificialintelligence, May-27-2019, 20:32:07 GMT

Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar

Automatic machine learning is an important problem at the forefront of machine learning. The strongest AutoML systems are based on neural networks, evolutionary algorithms, and Bayesian optimization. Recently, AlphaD3M reached state-of-the-art results with an order of magnitude speedup using reinforcement learning with self-play. In this work we extend AlphaD3M by using a pipeline grammar and a pre-trained model which generalizes from many different datasets and similar tasks. Our results demonstrate improved performance compared with our earlier work and existing methods on AutoML benchmark datasets for classification and regression tasks.

artificial intelligence, automatic machine learning, model-based reinforcement learning, (3 more...)

#artificialintelligence

Genre: Research Report > New Finding (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Brunton, Steven, Noack, Bernd, Koumoutsakos, Petros

Machine Learning for Fluid Mechanics

arXiv.org Machine Learning, May-27-2019

The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from experiments, field measurements, and large-scale simulations at multiple spatiotemporal scales. Machine learning presents us with a wealth of techniques to extract information from data that can be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of past history, current developments, and emerging opportunities of machine learning for fluid mechanics. We outline fundamental machine learning methodologies and discuss their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that links data with modeling, experiments, and simulations. Machine learning provides a powerful information processing framework that can augment, and possibly even transform, current lines of fluid mechanics research and industrial applications.

air transportation, deep learning, upstream oil & gas, (20 more...)

1905.11075

Country:

Europe > Switzerland (0.14)
Europe > United Kingdom (0.14)
North America > United States > Washington > King County > Seattle (0.14)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Energy > Oil & Gas > Upstream (1.00)
Transportation > Air (0.92)
Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

arXiv.org Machine Learning, May-27-2019

Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy

Yang, Ruihan, Ye, Qiwei, Liu, Tie-Yan

A fundamental issue in reinforcement learning algorithms is the balance between exploration of the environment and exploitation of information already obtained by the agent. In particular, exploration plays a critical role in both the efficiency and the efficacy of the learning process. However, existing works on exploration involve task-agnostic designs that may perform well in one environment but be ill-suited to another. To learn an effective and efficient exploration policy in an automated manner, we formalize a feasible metric for measuring the utility of exploration based on counterfactual ideology. Based on that metric, we propose an end-to-end algorithm to learn an exploration policy by meta-learning. We demonstrate that our method achieves good results compared to previous works on high-dimensional control tasks in the MuJoCo simulator.

artificial intelligence, exploration, upstream oil & gas, (16 more...)

1905.11583

Genre: Research Report (0.64)

Industry:

Energy > Oil & Gas > Upstream (0.74)
Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, May-27-2019

Actor-Attention-Critic for Multi-Agent Reinforcement Learning

Iqbal, Shariq, Sha, Fei

Reinforcement learning in multi-agent scenarios is important for real-world applications but presents challenges beyond those seen in single-agent settings. We present an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism which selects relevant information for each agent at every timestep. This attention mechanism enables more effective and scalable learning in complex multi-agent environments, when compared to recent approaches. Our approach is applicable not only to cooperative settings with shared rewards, but also individualized reward settings, including adversarial settings, as well as settings that do not provide global states, and it makes no assumptions about the action spaces of the agents. As such, it is flexible enough to be applied to most multi-agent learning problems.
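
To make the idea of an attention mechanism that "selects relevant information for each agent" more concrete, here is a minimal sketch of attention over other agents' encodings. The abstract does not specify the form of attention; scaled dot-product attention is assumed here, and the function names and vector inputs are illustrative only.

```python
import math


def attention_weights(query, keys):
    """Softmax of scaled dot products between one query and several keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weighted sum of value vectors, one weight per other agent."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
```

In this sketch, `query` would stand for the encoding of the agent whose critic is being evaluated, and `keys` and `values` for encodings of the other agents; an agent whose key aligns more closely with the query contributes more to the result.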

artificial intelligence, machine learning, reinforcement learning, (14 more...)

1810.02912

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine Learning, May-27-2019

Additive Adversarial Learning for Unbiased Authentication

Liang, Jian, Cao, Yuren, Zhang, Chenbin, Chang, Shiyu, Bai, Kun, Xu, Zenglin

Authentication is the task of confirming the correspondence between data instances and personal identities. Typical authentication applications include face recognition, person re-identification, authentication based on mobile devices, and so on. The recently emerging data-driven authentication process may encounter undesired biases, i.e., the models are often trained in one domain (e.g., on people wearing spring outfits) but required to apply in other domains (e.g., after they change into summer outfits). To address this issue, we propose a novel two-stage method that disentangles the class/identity from domain-differences, and we consider multiple types of domain-difference. In the first stage, we learn disentangled representations by a one-versus-rest disentangle learning (OVRDL) mechanism. In the second stage, we improve the disentanglement by an additive adversarial learning (AAL) mechanism. Moreover, we discuss the necessity to avoid a learning dilemma due to disentangling causally related types of domain-difference. Comprehensive evaluation results demonstrate the effectiveness and superiority of the proposed method.

machine learning, mechanism, reinforcement learning, (16 more...)

1905.06517

Country:

Asia > China (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Overview (0.68)
Research Report > New Finding (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)

arXiv.org Artificial Intelligence, May-27-2019

AgentGraph: Towards Universal Dialogue Management with Structured Deep Reinforcement Learning

Chen, Lu, Chen, Zhi, Tan, Bowen, Long, Sishan, Gasic, Milica, Yu, Kai

Dialogue policy plays an important role in task-oriented spoken dialogue systems. It determines how to respond to users. Recently proposed deep reinforcement learning (DRL) approaches have been used for policy optimization. However, these deep models are still challenging for two reasons: 1) Many DRL-based policies are not sample-efficient. 2) Most models don't have the capability of policy transfer between different domains. In this paper, we propose a universal framework, AgentGraph, to tackle these two problems. The proposed AgentGraph is a combination of a GNN-based architecture and a DRL-based algorithm. It can be regarded as a multi-agent reinforcement learning approach. Each agent corresponds to a node in a graph, which is defined according to the dialogue domain ontology. When making a decision, each agent can communicate with its neighbors on the graph. Under the AgentGraph framework, we further propose a Dual GNN-based dialogue policy, which implicitly decomposes the decision in each turn into a high-level global decision and a low-level local decision. Experiments show that AgentGraph models significantly outperform traditional reinforcement learning approaches on most of the 18 tasks of the PyDial benchmark. Moreover, when transferred from the source task to a target task, these models not only have acceptable initial performance but also converge much faster on the target task.

dialogue policy, machine learning, reinforcement learning, (14 more...)

1905.11259

Country:

Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > China > Shanghai > Shanghai (0.05)
Europe > Serbia > Central Serbia > Belgrade (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Efroni, Yonathan, Merlis, Nadav, Ghavamzadeh, Mohammad, Mannor, Shie

Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

arXiv.org Artificial Intelligence, May-27-2019

State-of-the-art efficient model-based Reinforcement Learning (RL) algorithms typically act by iteratively solving empirical models, i.e., by performing full-planning on Markov Decision Processes (MDPs) built from the gathered experience. In this paper, we focus on model-based RL in the finite-state finite-horizon MDP setting and establish that exploring with greedy policies, which act by 1-step planning, can achieve tight minimax performance in terms of regret, $\tilde{\mathcal{O}}(\sqrt{HSAT})$. Thus, full-planning in model-based RL can be avoided altogether without any performance degradation, and, by doing so, the computational complexity decreases by a factor of $S$. The results are based on a novel analysis of real-time dynamic programming, then extended to model-based RL. Specifically, we generalize existing algorithms that perform full-planning to ones that act by 1-step planning. For these generalizations, we prove regret bounds with the same rate as their full-planning counterparts.
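
The contrast between full-planning and 1-step planning can be illustrated with a small sketch of a greedy one-step lookahead in a tabular finite-horizon MDP. This is a simplified illustration under stated assumptions, not the algorithm analyzed in the paper: the optimism or exploration bonuses such algorithms rely on are omitted, and all names and the toy model below are hypothetical.

```python
def greedy_step(state, h, rewards, probs, values):
    """One-step lookahead at a single (state, step) pair.

    rewards[s][a] is a reward estimate, probs[s][a][s2] a transition
    estimate, and values[h][s] the current value estimate; all names are
    illustrative. Only values[h][state] is updated, so no other state is
    touched, unlike a full backward pass over every state.
    """
    q = [
        rewards[state][a]
        + sum(p * values[h + 1][s2] for s2, p in enumerate(probs[state][a]))
        for a in range(len(rewards[state]))
    ]
    best = max(range(len(q)), key=q.__getitem__)
    values[h][state] = q[best]
    return best


# Toy model (assumed data): 2 states, 2 actions, horizon 1.
toy_rewards = [[0.0, 1.0], [0.5, 0.0]]
toy_probs = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: action 0 stays, action 1 moves
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: action 0 stays, action 1 moves
]
toy_values = [[0.0, 0.0], [0.0, 2.0]]  # values[h][s], with h = 0 and h = 1
toy_action = greedy_step(0, 0, toy_rewards, toy_probs, toy_values)
```

Each call does work proportional to the number of actions times the number of states for the one visited state, whereas a full backward pass would repeat that work for every state; this is the intuition behind the factor-of-$S$ saving mentioned above.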

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1905.11527

Country:

Asia > Middle East > Israel (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)