Simulate Less, Expect More: Bringing Robot Swarms to Life via Low-Fidelity Simulations
Vega, Ricardo, Zhu, Kevin, Luke, Sean, Parsa, Maryam, Nowzari, Cameron
This paper proposes a novel methodology for addressing the simulation-reality gap in multi-robot swarm systems. Rather than immediately trying to shrink or 'bridge the gap' whenever a real-world experiment fails that worked in simulation, we characterize the conditions under which shrinking it is actually necessary. When these conditions are not satisfied, we show how very simple simulators can still be used both to (i) design new multi-robot systems and (ii) guide real-world swarming experiments toward certain emergent behaviors, even when the gap is very large. The key ideas are an iterative simulator-in-the-design-loop, in which real-world experiments, simulator modifications, and simulated experiments are intimately coupled in a way that minds the gap without needing to shrink it, and the use of minimally viable phase diagrams to guide real-world experiments. We demonstrate the usefulness of our methods by deploying a real multi-robot swarm system that successfully exhibits an emergent milling behavior.
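As a rough illustration of the kind of low-fidelity simulator the abstract refers to, here is a minimal point-robot sketch in which each robot carries a single binary forward sensor and switches between two fixed turning rates. The controller form, parameter values, and the crude order parameter are assumptions chosen for illustration, not the paper's actual models or results.

```python
import numpy as np

# Minimal low-fidelity point-robot simulator (illustrative sketch only).
# Each robot has a single binary forward sensor: 1 if another robot lies
# inside its sensing cone, 0 otherwise. It always drives forward at constant
# speed and picks one of two fixed turning rates based on the sensor bit.
# All parameter values below are assumptions, not taken from the paper.

N = 20                 # number of robots
SPEED = 0.1            # forward speed per step
TURN = (+0.3, -0.3)    # turning rate for sensor = 0 and sensor = 1
CONE = np.pi / 8       # half-angle of the sensing cone
RANGE = 3.0            # sensing range

rng = np.random.default_rng(0)
pos = rng.uniform(-1, 1, size=(N, 2))      # x, y positions
heading = rng.uniform(-np.pi, np.pi, N)    # headings in radians

def sensor_bits(pos, heading):
    """Return a 0/1 array: does any other robot fall inside each robot's cone?"""
    bits = np.zeros(N, dtype=int)
    for i in range(N):
        rel = pos - pos[i]
        dist = np.linalg.norm(rel, axis=1)
        ang = np.arctan2(rel[:, 1], rel[:, 0]) - heading[i]
        ang = (ang + np.pi) % (2 * np.pi) - np.pi      # wrap to [-pi, pi]
        seen = (dist > 0) & (dist < RANGE) & (np.abs(ang) < CONE)
        bits[i] = int(seen.any())
    return bits

for step in range(5000):
    bits = sensor_bits(pos, heading)
    heading += np.where(bits == 1, TURN[1], TURN[0])
    pos += SPEED * np.column_stack((np.cos(heading), np.sin(heading)))

# A crude order parameter: spread of robot distances from the swarm centroid.
radii = np.linalg.norm(pos - pos.mean(axis=0), axis=1)
print("mean radius %.2f, std %.2f" % (radii.mean(), radii.std()))
```

Sweeping the assumed controller parameters and plotting such an order parameter against them is one way to build the kind of minimally viable phase diagram the abstract describes.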
Hierarchical Approaches for Reinforcement Learning in Parameterized Action Space
Wei, Ermo, Wicke, Drew, Luke, Sean
We explore Deep Reinforcement Learning in a parameterized action space. Specifically, we investigate how to achieve sample-efficient end-to-end training in these tasks. We propose a new compact architecture for these tasks in which the parameter policy is conditioned on the output of the discrete action policy. We also propose two new methods based on the state-of-the-art algorithms Trust Region Policy Optimization (TRPO) and Stochastic Value Gradient (SVG) to train such an architecture. We demonstrate that these methods outperform the state-of-the-art method, Parameterized Action DDPG, on test domains.
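To make the architectural idea concrete, here is a hedged sketch of a parameterized-action policy whose continuous-parameter head is conditioned on the output of the discrete-action head. The layer sizes, the softmax conditioning, and the PyTorch framing are illustrative assumptions rather than the paper's exact network.

```python
import torch
import torch.nn as nn

# Sketch of a compact parameterized-action policy in which the continuous
# parameter head is conditioned on the output of the discrete action head.
# Sizes and the conditioning scheme are assumptions chosen for illustration.

class ParameterizedActionPolicy(nn.Module):
    def __init__(self, state_dim, n_discrete, param_dim, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # Discrete head: logits over the discrete actions.
        self.discrete_head = nn.Linear(hidden, n_discrete)
        # Parameter head: sees both the state features and the discrete action
        # distribution, so the parameters are conditioned on the chosen action.
        self.param_head = nn.Sequential(
            nn.Linear(hidden + n_discrete, hidden), nn.ReLU(),
            nn.Linear(hidden, param_dim), nn.Tanh(),   # parameters scaled to [-1, 1]
        )

    def forward(self, state):
        h = self.trunk(state)
        logits = self.discrete_head(h)
        probs = torch.softmax(logits, dim=-1)
        params = self.param_head(torch.cat([h, probs], dim=-1))
        return logits, params

# Usage: sample a discrete action, then read off its continuous parameters.
policy = ParameterizedActionPolicy(state_dim=10, n_discrete=3, param_dim=4)
state = torch.randn(1, 10)
logits, params = policy(state)
action = torch.distributions.Categorical(logits=logits).sample()
print(action.item(), params.detach().numpy())
```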
Multiagent Soft Q-Learning
Wei, Ermo, Wicke, Drew, Freelan, David, Luke, Sean
Policy gradient methods are often applied to reinforcement learning in continuous multiagent games. These methods perform local search in the joint-action space, and as we show, they are susceptible to a game-theoretic pathology known as relative overgeneralization. To resolve this issue, we propose Multiagent Soft Q-learning, which can be seen as the analogue of applying Q-learning to continuous control. We compare our method to MADDPG, a state-of-the-art approach, and show that our method achieves better coordination in multiagent cooperative tasks, converging to better local optima in the joint action space.
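The relative overgeneralization pathology mentioned above can be illustrated with a small "max of two quadratics" style cooperative game: averaging over the other agent's exploratory actions favors a broad but suboptimal basin, while a soft (Boltzmann) distribution over joint actions keeps its mass on the narrow optimum. The payoff constants and temperature below are assumptions for illustration, not values from the paper.

```python
import numpy as np

# Illustrative continuous cooperative game with relative overgeneralization.
# The joint reward has a narrow high peak at (5, 5) and a broad lower peak
# at (-5, -5). Constants are illustrative assumptions.

def reward(a, b):
    narrow = 10.0 - 2.0 * ((a - 5) ** 2 + (b - 5) ** 2)   # tall but narrow
    broad = 5.0 - 0.1 * ((a + 5) ** 2 + (b + 5) ** 2)     # lower but broad
    return np.maximum(narrow, broad)

actions = np.linspace(-10, 10, 201)
A, B = np.meshgrid(actions, actions, indexing="ij")
R = reward(A, B)

# Independent local search: each agent scores its action averaged over a
# uniform exploration distribution of the other agent. The broad, suboptimal
# basin dominates the average, which is relative overgeneralization.
marginal = R.mean(axis=1)
print("best action under marginalization:", actions[marginal.argmax()])  # near -5

# Joint (team) view of the same payoff: the true optimum is the narrow peak.
i, j = np.unravel_index(R.argmax(), R.shape)
print("jointly optimal actions:", actions[i], actions[j])                # near (5, 5)

# A soft (Boltzmann) distribution over joint actions, in the spirit of soft
# Q-learning, concentrates its mass on the narrow optimum; the temperature
# is an assumed illustrative value.
temperature = 0.5
p = np.exp((R - R.max()) / temperature)
p /= p.sum()
print("soft joint expectation:", (p * A).sum(), (p * B).sum())           # near (5, 5)
```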
LfD Training of Heterogeneous Formation Behaviors
Squires, William G. (George Mason University) | Luke, Sean (George Mason University)
Problem domains such as disaster relief, search and rescue, and games can benefit from having a human quickly train coordinated behaviors for a diverse set of agents. Hierarchical Training of Agent Behaviors (HiTAB) is a Learning from Demonstration (LfD) approach that addresses some inherent complexities in multiagent learning, making it possible to train complex heterogeneous behaviors from a small set of training samples. In this paper, we successfully demonstrate LfD training of formation behaviors using a small set of agents that, without retraining, continue to operate correctly when additional agents are available. We selected formations for the experiments because they require a great deal of coordination between agents, are heterogeneous due to the differing roles of participating agents, and can scale as the number of agents grows. We also introduce some extensions to HiTAB that facilitate this type of training.
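As a loose sketch of the learning-from-demonstration idea (not HiTAB's actual algorithm or formation behaviors), the following trains a tiny behavior selector from a handful of demonstrated (situation, behavior) samples and wraps it so it could serve as a sub-behavior in a larger hierarchy. The behaviors, features, and nearest-neighbor transition rule are illustrative assumptions.

```python
import numpy as np

# Illustrative-only sketch: a trained behavior whose "transition function" is
# a classifier fit to (feature vector -> behavior) demonstration samples, and
# which can itself be reused as a building block in a higher-level behavior.

def go_to_leader(agent):    return "moving toward leader"
def hold_offset(agent):     return "holding formation offset"
def avoid_obstacle(agent):  return "steering around obstacle"

BEHAVIORS = [go_to_leader, hold_offset, avoid_obstacle]

class TrainedBehavior:
    """A behavior built from demonstration samples; usable as a sub-behavior
    inside a higher-level trained behavior (the hierarchical part)."""

    def __init__(self):
        self.samples = []   # list of (feature vector, behavior index)

    def demonstrate(self, features, behavior_index):
        self.samples.append((np.asarray(features, dtype=float), behavior_index))

    def __call__(self, agent, features):
        # 1-nearest-neighbor rule: pick the behavior whose demonstrated
        # situation is closest to the current feature vector.
        feats = np.asarray(features, dtype=float)
        dists = [np.linalg.norm(feats - f) for f, _ in self.samples]
        _, idx = self.samples[int(np.argmin(dists))]
        return BEHAVIORS[idx](agent)

# Demonstration samples: features = (distance to formation slot, obstacle proximity).
formation_follower = TrainedBehavior()
formation_follower.demonstrate([5.0, 0.0], 0)   # far from slot  -> go to leader
formation_follower.demonstrate([0.2, 0.0], 1)   # in slot        -> hold offset
formation_follower.demonstrate([2.0, 0.9], 2)   # obstacle near  -> avoid

print(formation_follower(agent=None, features=[4.0, 0.1]))  # moving toward leader
```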
Bounty Hunting and Human-Agent Group Task Allocation
Wicke, Drew (George Mason University) | Luke, Sean (George Mason University)
Much research has been done to apply auctions, markets, and negotiation mechanisms to solve the multiagent task allocation problem. However, there has been very little work on human-agent group task allocation. We believe that the notion of bounty hunting has good properties for human-agent group interaction in dynamic task allocation problems. We use previous experimental results comparing bounty hunting with auction-like methods to argue why it would be particularly adept at handling scenarios with unreliable collaborators and unexpectedly hard tasks: scenarios we believe highlight the difficulties involved in working with human collaborators.
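As a toy illustration of why bounty hunting copes with unreliable collaborators and unexpectedly hard tasks, the sketch below lets each unfinished task's bounty grow over time so that abandoned or difficult tasks eventually attract another hunter. The growth rate, commitment rule, and failure model are assumptions, not the paper's mechanism.

```python
import random

# Toy bounty-hunting task allocation (illustrative sketch only).
# Each unfinished task carries a bounty that grows every step, so tasks that
# are unexpectedly hard or abandoned by unreliable agents keep getting richer
# until some bounty hunter finally completes them.

random.seed(1)

tasks = [{"id": t, "bounty": 1.0, "difficulty": random.randint(2, 8), "done": False}
         for t in range(5)]
agents = [{"id": a, "reliability": random.uniform(0.5, 1.0), "task": None, "progress": 0}
          for a in range(3)]

for step in range(100):
    for task in tasks:
        if not task["done"]:
            task["bounty"] += 0.5                       # bounty grows until completion
    for agent in agents:
        if agent["task"] is None:
            open_tasks = [t for t in tasks if not t["done"]]
            if open_tasks:                              # commit to the richest open task
                agent["task"] = max(open_tasks, key=lambda t: t["bounty"])
                agent["progress"] = 0
        else:
            task = agent["task"]
            if task["done"]:                            # someone else got there first
                agent["task"] = None
                continue
            if random.random() > agent["reliability"]:  # unreliable: abandon the task
                agent["task"] = None
                continue
            agent["progress"] += 1
            if agent["progress"] >= task["difficulty"]:
                task["done"] = True                      # collect the bounty
                print(f"step {step}: agent {agent['id']} finished task {task['id']} "
                      f"for bounty {task['bounty']:.1f}")
                agent["task"] = None
    if all(t["done"] for t in tasks):
        break
```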
Reports on the 2004 AAAI Fall Symposia
Cassimatis, Nick, Luke, Sean, Levy, Simon D., Gayler, Ross, Kanerva, Pentti, Eliasmith, Chris, Bickmore, Timothy, Schultz, Alan C., Davis, Randall, Landay, James, Miller, Rob, Saund, Eric, Stahovich, Tom, Littman, Michael, Singh, Satinder, Argamon, Shlomo, Dubnov, Shlomo
The Association for the Advancement of Artificial Intelligence presented its 2004 Fall Symposium Series Friday through Sunday, October 22-24 at the Hyatt Regency Crystal City in Arlington, Virginia, adjacent to Washington, DC. The symposium series was preceded by a one-day AI funding seminar. The topics of the eight symposia in the 2004 Fall Symposia Series were: (1) Achieving Human-Level Intelligence through Integrated Systems and Research; (2) Artificial Multiagent Learning; (3) Compositional Connectionism in Cognitive Science; (4) Dialogue Systems for Health Communications; (5) The Intersection of Cognitive Science and Robotics: From Interfaces to Intelligence; (6) Making Pen-Based Interaction Intelligent and Natural; (7) Real-Life Reinforcement Learning; and (8) Style and Meaning in Language, Art, Music, and Design.
Three RoboCup Simulation League Commentator Systems
Andre, Elisabeth, Binsted, Kim, Tanaka-Ishii, Kumiko, Luke, Sean, Herzog, Gerd, Rist, Thomas
Three systems that generate real-time natural language commentary on the RoboCup simulation league are presented, and their similarities, differences, and directions for future work are discussed. Although they emphasize different aspects of the commentary problem, all three systems take simulator data as input and generate appropriate, expressive, spoken commentary in real time.