AITopics | deep rl agent

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

dbb422937d7ff56e049d61da730b3e11-Paper.pdf

Neural Information Processing Systems, Aug-17-2025, 21:20:54 GMT

machine learning, natural language, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.46)

Industry: Information Technology (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

AdaStop: sequential testing for efficient and reliable comparisons of Deep RL Agents

Mathieu, Timothée, Della Vecchia, Riccardo, Shilova, Alena, de Medeiros, Matheus Centa, Kohler, Hector, Maillard, Odalric-Ambrym, Preux, Philippe

arXiv.org Artificial Intelligence, Jun-19-2023

The reproducibility of many experimental results in Deep Reinforcement Learning (RL) is under question. To solve this reproducibility crisis, we propose a theoretically sound methodology to compare multiple Deep RL algorithms. The performance of one execution of a Deep RL algorithm is random, so independent executions are needed to assess it precisely. When comparing several RL algorithms, a major question is how many executions must be made and how we can ensure that the results of such a comparison are theoretically sound. Researchers in Deep RL often use fewer than 5 independent executions to compare algorithms: we claim that this is not enough in general. Moreover, when comparing several algorithms at once, the error of each comparison accumulates and must be taken into account with a multiple tests procedure to preserve low error guarantees. To address this problem in a statistically sound way, we introduce AdaStop, a new statistical test based on multiple group sequential tests. When comparing algorithms, AdaStop adapts the number of executions to stop as early as possible while ensuring that we have enough information to distinguish algorithms that perform better than the others in a statistically significant way. We prove both theoretically and empirically that AdaStop has a low probability of making an error (Family-Wise Error). Finally, we illustrate the effectiveness of AdaStop in multiple use-cases, including toy examples and difficult cases such as MuJoCo environments.
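
The point about errors accumulating across comparisons can be made concrete with a little arithmetic. The sketch below is not AdaStop; it only assumes independent tests at a common level alpha and shows how the family-wise error grows, along with the standard Bonferroni correction as one simple remedy. The function names are illustrative.

```python
def family_wise_error(alpha: float, num_comparisons: int) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every test is run at the same level `alpha` and that the
    tests are independent, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** num_comparisons


def bonferroni_level(alpha: float, num_comparisons: int) -> float:
    """Per-test level that keeps the family-wise error at or below `alpha`."""
    return alpha / num_comparisons


if __name__ == "__main__":
    for k in (1, 3, 10):
        raw = family_wise_error(0.05, k)
        corrected = family_wise_error(bonferroni_level(0.05, k), k)
        print(f"{k:2d} comparisons: uncorrected={raw:.3f}, Bonferroni={corrected:.3f}")
```

With ten uncorrected comparisons at the 5% level, the chance of at least one spurious "significant" difference is roughly 40%, which is why a multiple-testing procedure is needed.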

adastop, agent, hypothesis, (15 more...)

arXiv.org Artificial Intelligence

2306.10882

Country:

Europe > France > Hauts-de-France > Nord > Lille (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Portugal > Braga > Braga (0.04)

Genre: Research Report > Experimental Study (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Evaluating the Robustness of Collaborative Agents

Knott, Paul, Carroll, Micah, Devlin, Sam, Ciosek, Kamil, Hofmann, Katja, Dragan, A. D., Shah, Rohin

arXiv.org Artificial Intelligence, Jan-14-2021

In order for agents trained by deep reinforcement learning to work alongside humans in realistic settings, we will need to ensure that the agents are robust. Since the real world is very diverse, and human behavior often changes in response to agent deployment, the agent will likely encounter novel situations that have never been seen during training. This results in an evaluation challenge: if we cannot rely on the average training or validation reward as a metric, then how can we effectively evaluate robustness? We take inspiration from the practice of unit testing in software engineering. Specifically, we suggest that when designing AI agents that collaborate with humans, designers should search for potential edge cases in possible partner behavior and possible states encountered, and write tests which check that the behavior of the agent in these edge cases is reasonable. We apply this methodology to build a suite of unit tests for the Overcooked-AI environment, and use this test suite to evaluate three proposals for improving robustness. We find that the test suite provides significant insight into the effects of these proposals that were generally not revealed by looking solely at the average validation reward.
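
A minimal sketch of what such a unit test might look like follows. It does not use Overcooked-AI: the one-dimensional corridor, the `toy_agent` policy, and the edge cases are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, int]
Policy = Callable[[State], str]


def toy_agent(state: State) -> str:
    """Hypothetical policy: walk toward the goal, wait if the partner blocks."""
    if state["agent"] == state["goal"]:
        return "stay"
    step = 1 if state["goal"] > state["agent"] else -1
    if state["agent"] + step == state["partner"]:
        return "stay"  # do not walk into the partner
    return "right" if step == 1 else "left"


# Each edge case pairs a hand-written state with the set of acceptable actions.
EDGE_CASES: List[Tuple[State, Set[str]]] = [
    ({"agent": 0, "partner": 1, "goal": 3}, {"stay"}),  # partner blocks the path
    ({"agent": 2, "partner": 0, "goal": 2}, {"stay"}),  # already at the goal
    ({"agent": 3, "partner": 0, "goal": 1}, {"left"}),  # goal lies behind the agent
]


def run_edge_case_tests(policy: Policy, cases: List[Tuple[State, Set[str]]]) -> List[str]:
    """Return a description of every edge case where the action is unacceptable."""
    failures = []
    for state, acceptable in cases:
        action = policy(state)
        if action not in acceptable:
            failures.append(f"{state} -> {action}")
    return failures


if __name__ == "__main__":
    print("failures:", run_edge_case_tests(toy_agent, EDGE_CASES))
```

The design choice is that each test checks a specific behavior in a specific state rather than an average reward, so a failure points directly at the situation the agent handles badly.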

agent, layout, robustness, (15 more...)

arXiv.org Artificial Intelligence

2101.05507

Country: Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.04)

Genre: Research Report > New Finding (0.93)

Industry:

Education (0.68)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.40)

Explainability via Responsibility

Khadivpour, Faraz, Guzdial, Matthew

arXiv.org Artificial Intelligence, Oct-4-2020

Procedural Content Generation via Machine Learning (PCGML) refers to a group of methods for creating game content (e.g. platformer levels, game maps, etc.) using machine learning models. PCGML approaches rely on black box models, which can be difficult for human designers without expert knowledge of machine learning to understand and debug. This can be even trickier in co-creative systems where human designers must interact with AI agents to generate game content. In this paper we present an approach to explainable artificial intelligence in which certain training instances are offered to human users as an explanation for the AI agent's actions during a co-creation process. We evaluate this approach by approximating its ability to provide human users with explanations of the AI agent's actions and to help them cooperate more efficiently with the AI agent.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2010.01676

Country:

North America > Canada > Alberta (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.88)

Integrating Deep Reinforcement Learning Networks with Health System Simulations

Allen, Michael, Monks, Thomas

arXiv.org Artificial Intelligence, Jul-21-2020

Background and motivation: Combining Deep Reinforcement Learning (Deep RL) and Health Systems Simulations has significant potential, for both research into improving Deep RL performance and safety, and in operational practice. While individual toolkits exist for Deep RL and Health Systems Simulations, no framework to integrate the two has been established. Aim: Provide a framework for integrating Deep RL Networks with Health System Simulations, and ensure this framework is compatible with Deep RL agents that have been developed and tested using OpenAI Gym. Methods: We developed our framework based on the OpenAI Gym framework, and demonstrate its use on a simple hospital bed capacity model. We built the Deep RL agents using PyTorch, and the Hospital Simulation using SimPy. Results: We demonstrate example models using a Double Deep Q Network or a Duelling Double Deep Q Network as the Deep RL agent. Conclusion: SimPy may be used to create Health System Simulations that are compatible with agents developed and tested on OpenAI Gym environments. GitHub repository of code: https://github.com/MichaelAllen1966/learninghospital
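
To illustrate what "compatible with OpenAI Gym" means in practice, the sketch below mimics the classic Gym `reset`/`step` calling convention in plain Python. It imports neither Gym nor SimPy and is not the authors' framework: the class `ToyBedCapacityEnv`, its random-walk patient dynamics, and the reward are invented for illustration.

```python
import random
from typing import Dict, Tuple

Observation = Tuple[int, int]  # (beds, patients)


class ToyBedCapacityEnv:
    """Hypothetical bed-capacity environment with Gym-style reset/step methods."""

    ACTIONS = (-1, 0, 1)  # remove a bed, keep capacity, add a bed

    def __init__(self, horizon: int = 30, seed: int = 0) -> None:
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.reset()

    def reset(self) -> Observation:
        self.day, self.beds, self.patients = 0, 10, 8
        return (self.beds, self.patients)

    def step(self, action_index: int) -> Tuple[Observation, float, bool, Dict]:
        self.beds = max(0, self.beds + self.ACTIONS[action_index])
        # Invented dynamics: the patient count follows a bounded random walk.
        self.patients = max(0, self.patients + self.rng.choice((-1, 0, 1)))
        self.day += 1
        reward = -float(abs(self.beds - self.patients))  # penalize any mismatch
        done = self.day >= self.horizon
        return (self.beds, self.patients), reward, done, {}


def run_episode(env: ToyBedCapacityEnv) -> Tuple[float, int]:
    """Run one episode with a simple rule-based policy in place of a trained agent."""
    beds, patients = env.reset()
    total, steps, done = 0.0, 0, False
    while not done:
        action = 2 if patients > beds else 0 if patients < beds else 1
        (beds, patients), reward, done, _ = env.step(action)
        total += reward
        steps += 1
    return total, steps


if __name__ == "__main__":
    print(run_episode(ToyBedCapacityEnv(horizon=30, seed=0)))
```

Any agent written against this observation, reward, done, info loop can be swapped in for the rule-based policy without changing the environment.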

machine learning, reinforcement learning, simulation, (11 more...)

arXiv.org Artificial Intelligence

2008.07434

Genre: Research Report (1.00)

Industry: Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.68)

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

Lin, Yen-Chen, Hong, Zhang-Wei, Liao, Yuan-Hong, Shih, Meng-Li, Liu, Ming-Yu, Sun, Min

arXiv.org Machine Learning, Nov-12-2019

We introduce two tactics to attack agents trained by deep reinforcement learning algorithms using adversarial examples, namely the strategically-timed attack and the enchanting attack. In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an adversarial example should be crafted and applied. In the enchanting attack, the adversary aims at luring the agent to a designated target state. This is achieved by combining a generative model and a planning algorithm: while the generative model predicts the future states, the planning algorithm generates a preferred sequence of actions for luring the agent. A sequence of adversarial examples is then crafted to lure the agent to take the preferred sequence of actions. We apply the two tactics to agents trained by state-of-the-art deep reinforcement learning algorithms, including DQN and A3C. In 5 Atari games, our strategically-timed attack reduces reward as much as the uniform attack (i.e., attacking at every time step) does, while attacking the agent 4 times less often. Our enchanting attack lures the agent toward designated target states with a more than 70% success rate. Videos are available at http://yenchenlin.me/adversarial_attack_RL/
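
The abstract does not spell out how the attack times are chosen, so the sketch below assumes one plausible rule for illustration: attack only at steps where the agent strongly prefers one action, measured as the gap between its highest and lowest action probabilities. The threshold and function names are hypothetical, and no adversarial example is actually crafted here.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Turn action scores (e.g. Q-values) into a probability distribution."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def should_attack(action_probs: Sequence[float], threshold: float) -> bool:
    """Assumed timing rule: attack when the preference gap exceeds the threshold."""
    return max(action_probs) - min(action_probs) > threshold


def attack_schedule(episode_probs: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Return the time steps at which the assumed rule would trigger an attack."""
    return [t for t, probs in enumerate(episode_probs) if should_attack(probs, threshold)]


if __name__ == "__main__":
    episode = [softmax([0.0, 0.0, 0.0]), softmax([5.0, 0.0, 0.0]), softmax([0.1, 0.0, 0.0])]
    print(attack_schedule(episode, threshold=0.5))
```

Under this rule, steps where the agent is nearly indifferent between actions are skipped, which is how an attacker could spend far fewer perturbations than a uniform attack.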

adversarial example, agent, deep rl agent, (15 more...)

arXiv.org Machine Learning

1703.06748

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
Asia > Taiwan (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.95)
Leisure & Entertainment > Games > Computer Games (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Visualizing and Understanding Atari Agents

Greydanus, Sam, Koul, Anurag, Dodge, Jonathan, Fern, Alan

arXiv.org Artificial Intelligence, Sep-10-2018

While deep reinforcement learning (deep RL) agents are effective at maximizing rewards, it is often unclear what strategies they use to do so. In this paper, we take a step toward explaining deep RL agents through a case study using Atari 2600 environments. In particular, we focus on using saliency maps to understand how an agent learns and executes a policy. We introduce a method for generating useful saliency maps and use it to show 1) what strong agents attend to, 2) whether agents are making decisions for the right or wrong reasons, and 3) how agents evolve during learning. We also test our method on non-expert human subjects and find that it improves their ability to reason about these agents. Overall, our results show that saliency information can provide significant insight into an RL agent's decisions and learning behavior.
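
As a rough illustration of the idea behind saliency maps, the sketch below uses a generic perturbation approach: change one input cell at a time and record how much a policy's output moves. This is an assumption for illustration, not the method introduced in the paper, and `toy_policy_score` is a hypothetical stand-in for a trained agent.

```python
from typing import Callable, List

Frame = List[List[float]]


def toy_policy_score(frame: Frame) -> float:
    """Hypothetical policy output that depends only on the top-left 2x2 patch."""
    return sum(frame[r][c] for r in range(2) for c in range(2))


def perturbation_saliency(
    frame: Frame, score_fn: Callable[[Frame], float], baseline: float = 0.0
) -> Frame:
    """Saliency of a cell = absolute change in score when the cell is replaced."""
    reference = score_fn(frame)
    saliency = []
    for r, row in enumerate(frame):
        saliency_row = []
        for c in range(len(row)):
            perturbed = [list(values) for values in frame]  # copy, then edit one cell
            perturbed[r][c] = baseline
            saliency_row.append(abs(reference - score_fn(perturbed)))
        saliency.append(saliency_row)
    return saliency


if __name__ == "__main__":
    frame = [[1.0] * 3 for _ in range(3)]
    for row in perturbation_saliency(frame, toy_policy_score):
        print(row)
```

Cells the policy ignores get zero saliency, while cells inside the patch it reads get a positive value, which is the kind of signal used to ask what an agent attends to.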

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

1711.00138

Country: North America > United States > Oregon (0.14)

Genre: Research Report > New Finding (0.68)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Government > Military (0.46)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Comparing Reward Shaping, Visual Hints, and Curriculum Learning

Pocius, Rey (Oregon State University) | Isele, David (University of Pennsylvania) | Roberts, Mark (United States Naval Research Laboratory) | Aha, David W. (United States Naval Research Laboratory)

AAAI Conferences, Feb-8-2018

Common approaches to learning complex tasks in reinforcement learning include reward shaping, environmental hints, or a curriculum. Yet few studies examine how they compare to each other, when one might prefer one approach, or how they may complement each other. As a first step in this direction, we compare reward shaping, hints, and curricula for a Deep RL agent in the game of Minecraft. We seek to answer whether reward shaping, visual hints, or the curricula have the most impact on performance, which we measure as the time to reach the target, the distance from the target, the cumulative reward, or the number of actions taken. Our analyses show that performance is most impacted by the curriculum used and visual hints; shaping had less impact. For similar navigation tasks, the results suggest that designing an effective curriculum and providing appropriate hints most improve performance.
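
For readers unfamiliar with reward shaping, the sketch below shows one standard form, potential-based shaping, on a grid navigation task. The abstract does not say which shaping scheme the study used, so this is an illustrative assumption; the Manhattan-distance potential and the function names are hypothetical.

```python
from typing import Tuple

Position = Tuple[int, int]


def potential(position: Position, target: Position) -> float:
    """Assumed potential: negative Manhattan distance to the target."""
    return -float(abs(position[0] - target[0]) + abs(position[1] - target[1]))


def shaped_reward(
    env_reward: float,
    state: Position,
    next_state: Position,
    target: Position,
    gamma: float = 0.99,
) -> float:
    """Add the potential-based bonus gamma * phi(s') - phi(s) to the environment reward."""
    return env_reward + gamma * potential(next_state, target) - potential(state, target)


if __name__ == "__main__":
    target = (3, 0)
    print("toward:", shaped_reward(0.0, (0, 0), (1, 0), target))
    print("away:  ", shaped_reward(0.0, (0, 0), (-1, 0), target))
```

Moving toward the target earns a positive bonus and moving away earns a negative one, so the agent receives feedback on every step instead of only on arrival.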

artificial intelligence, machine learning, reinforcement learning, (15 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania (0.15)
North America > United States > Oregon (0.15)

Genre:

Research Report > Experimental Study (0.89)
Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
