MacAlpine, Patrick
Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL
Mazoure, Bogdan, Ahmed, Ahmed M., MacAlpine, Patrick, Hjelm, R Devon, Kolobov, Andrey
A highly desirable property of a reinforcement learning (RL) agent -- and a major difficulty for deep RL approaches -- is the ability to generalize policies learned on a few tasks over a high-dimensional observation space to similar tasks not seen during training. Many promising approaches to this challenge consider RL as a process of training two functions simultaneously: a complex nonlinear encoder that maps high-dimensional observations to a latent representation space, and a simple linear policy over this space. We posit that a superior encoder for zero-shot generalization in RL can be trained solely with an auxiliary self-supervised learning (SSL) objective, provided the training process encourages the encoder to map behaviorally similar observations to similar representations, since a reward-based signal can cause overfitting in the encoder (Raileanu et al., 2021). We propose Cross-Trajectory Representation Learning (CTRL), a method that runs within an RL agent and conditions its encoder to recognize behavioral similarity in observations by applying a novel SSL objective to pairs of trajectories from the agent's policies. CTRL can be viewed as having the same effect as inducing a pseudo-bisimulation metric but, crucially, avoids the use of rewards and the associated overfitting risks. Our experiments ablate various components of CTRL and demonstrate that, in combination with PPO, it achieves better generalization performance on the challenging Procgen benchmark suite (Cobbe et al., 2020).
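To make the pairing idea concrete, here is a minimal sketch of a reward-free SSL loss over pairs of trajectory segments. It uses a simple contrastive (InfoNCE-style) objective as a stand-in for CTRL's actual objective, and the encoder, projector, segment shapes, and function names are illustrative assumptions rather than the paper's implementation.

    import torch
    import torch.nn.functional as F

    def cross_trajectory_ssl_loss(encoder, projector, seg_a, seg_b, temperature=0.1):
        """Simplified, reward-free SSL loss over paired trajectory segments.

        seg_a, seg_b: (batch, k, obs_dim) tensors of trajectory segments assumed
        to be behaviorally similar (e.g., generated by the agent's policies from
        nearby states). No reward signal appears anywhere in this loss.
        """
        # Encode each segment by averaging its per-step representations.
        z_a = F.normalize(projector(encoder(seg_a).mean(dim=1)), dim=-1)  # (batch, d)
        z_b = F.normalize(projector(encoder(seg_b).mean(dim=1)), dim=-1)  # (batch, d)

        # InfoNCE: each segment's positive is its paired segment; all other
        # segments in the batch serve as negatives.
        logits = z_a @ z_b.t() / temperature
        targets = torch.arange(z_a.size(0), device=z_a.device)
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))

    # Toy usage with linear modules standing in for the nonlinear encoder.
    enc, proj = torch.nn.Linear(64, 128), torch.nn.Linear(128, 32)
    loss = cross_trajectory_ssl_loss(enc, proj,
                                     torch.randn(8, 5, 64), torch.randn(8, 5, 64))
    loss.backward()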
Measuring Sample Efficiency and Generalization in Reinforcement Learning Benchmarks: NeurIPS 2020 Procgen Benchmark
Mohanty, Sharada, Poonganam, Jyotish, Gaidon, Adrien, Kolobov, Andrey, Wulfe, Blake, Chakraborty, Dipam, Šemetulskis, Gražvydas, Schapke, João, Kubilius, Jonas, Pašukonis, Jurgis, Klimas, Linas, Hausknecht, Matthew, MacAlpine, Patrick, Tran, Quang Nhat, Tumiel, Thomas, Tang, Xiaocheng, Chen, Xinwei, Hesse, Christopher, Hilton, Jacob, Guss, William Hebgen, Genc, Sahika, Schulman, John, Cobbe, Karl
The NeurIPS 2020 Procgen Competition was designed as a centralized benchmark with clearly defined tasks for measuring sample efficiency and generalization in reinforcement learning. Generalization remains one of the most fundamental challenges in deep reinforcement learning, yet the community has few benchmarks with which to measure its progress on this front. We present the design of a centralized reinforcement learning benchmark that measures sample efficiency and generalization by performing end-to-end evaluation of the training and rollout phases of thousands of user-submitted codebases in a scalable way. We built the benchmark on top of the existing Procgen Benchmark by defining clear tasks and standardizing the end-to-end evaluation setup. The design aims to maximize the flexibility available to researchers who wish to design future iterations of such benchmarks, while imposing the practical constraints necessary for such a system to scale. This paper presents the competition setup along with details and analysis of the top solutions identified in the 2020 iteration of the competition at NeurIPS.
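For context, the underlying Procgen Benchmark measures generalization by training on a fixed set of procedurally generated levels and evaluating on the full level distribution. A minimal sketch of that split using the procgen package's gym registration follows; the game, mode, and level counts are illustrative rather than the competition's exact settings.

    import gym

    # Train on a fixed set of 200 levels of one game...
    train_env = gym.make("procgen:procgen-coinrun-v0",
                         num_levels=200, start_level=0,
                         distribution_mode="easy")

    # ...and evaluate on the full level distribution (num_levels=0), so test
    # levels are almost surely unseen during training.
    test_env = gym.make("procgen:procgen-coinrun-v0",
                        num_levels=0, start_level=0,
                        distribution_mode="easy")

    obs = train_env.reset()
    obs, reward, done, info = train_env.step(train_env.action_space.sample())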
Multi-Preference Actor Critic
Durugkar, Ishan, Hausknecht, Matthew, Swaminathan, Adith, MacAlpine, Patrick
Policy gradient algorithms typically combine discounted future rewards with an estimated value function to compute the direction and magnitude of parameter updates. However, for most reinforcement learning tasks, humans can provide additional insight to constrain the policy learning. We introduce a general method for incorporating multiple such feedback channels into a single policy gradient loss. In our formulation, the Multi-Preference Actor Critic (M-PAC), these different types of feedback are implemented as constraints on the policy. We use a Lagrangian relaxation to satisfy these constraints via gradient descent while learning a policy that maximizes rewards. Experiments in Atari and Pendulum verify that the constraints are respected and can accelerate the learning process.
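As a rough illustration of the Lagrangian relaxation, the sketch below folds several preference constraints into a single loss that the policy parameters descend while non-negative multipliers ascend; the function name, shapes, and update scheme are assumptions for illustration, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def multi_preference_loss(policy_loss, constraint_costs, log_multipliers, limits):
        """Lagrangian-relaxed loss in the spirit of M-PAC (illustrative only).

        policy_loss:      scalar actor-critic loss to be minimized.
        constraint_costs: (n,) tensor of per-constraint violation estimates for
                          the current batch, one entry per feedback channel.
        log_multipliers:  (n,) learnable tensor; softplus keeps multipliers >= 0.
        limits:           (n,) tensor of allowed violation levels.
        """
        lams = F.softplus(log_multipliers)
        violation = constraint_costs - limits
        # The policy descends the Lagrangian with the multipliers held fixed...
        primal = policy_loss + (lams.detach() * violation).sum()
        # ...while the multipliers ascend it with the costs held fixed.
        dual = -(lams * violation.detach()).sum()
        return primal + dual

    # Toy usage: a single optimizer over both the policy parameters and the
    # multipliers performs the primal-dual update thanks to the detach() calls.
    log_multipliers = torch.zeros(2, requires_grad=True)
    policy_loss = torch.tensor(1.3, requires_grad=True)  # stand-in for the PG loss
    loss = multi_preference_loss(policy_loss,
                                 constraint_costs=torch.tensor([0.4, 0.1]),
                                 log_multipliers=log_multipliers,
                                 limits=torch.tensor([0.2, 0.2]))
    loss.backward()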
UT Austin Villa 2014: RoboCup 3D Simulation League Champion via Overlapping Layered Learning
MacAlpine, Patrick (University of Texas at Austin) | Depinet, Mike (University of Texas at Austin) | Stone, Peter (University of Texas at Austin)
Layered learning is a hierarchical machine learning paradigm that enables learning of complex behaviors by incrementally learning a series of sub-behaviors. A key feature of layered learning is that higher layers directly depend on the learned lower layers. In its original formulation, lower layers were frozen prior to learning higher layers. This paper considers an extension to the paradigm that allows certain behaviors to be learned independently and then stitched together by learning at the "seams" where their influences overlap. The UT Austin Villa 2014 RoboCup 3D simulation team, using such overlapping layered learning, learned a total of 19 layered behaviors for a simulated soccer-playing robot, organized both in series and in parallel. To the best of our knowledge, this is more than three times the number of layered behaviors in any prior layered learning system. Furthermore, the complete learning process was repeated on four different robot body types, showcasing its generality as a paradigm for efficient behavior learning. The resulting team won the RoboCup 2014 championship with an undefeated record, scoring 52 goals and conceding none. This paper includes a detailed experimental analysis of the team's performance and of the overlapping layered learning approach that led to its success.
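The toy sketch below illustrates the overlapping layered learning recipe in miniature: learn behaviors independently with lower layers frozen, then re-optimize only the parameters at the "seam" where the behaviors interact. The behaviors, parameters, fitness functions, and hill-climbing optimizer are all invented for illustration; the team's actual behaviors and optimization framework are far richer.

    import random

    def hill_climb(params, fitness, iters=300, step=0.1):
        """Minimal stand-in for a policy-search optimizer; the real system would
        score candidates by running the robot in simulation."""
        best, best_fit = dict(params), fitness(params)
        for _ in range(iters):
            cand = {k: v + random.gauss(0, step) for k, v in best.items()}
            cand_fit = fitness(cand)
            if cand_fit > best_fit:
                best, best_fit = cand, cand_fit
        return best

    # One layer: learn a walk on its own, then freeze it.
    walk = hill_climb({"step_len": 0.0, "hip_gain": 0.0},
                      fitness=lambda p: -(p["step_len"] - 0.5) ** 2
                                        - (p["hip_gain"] - 1.0) ** 2)

    # A parallel layer: learn a kick independently, also frozen afterwards.
    kick = hill_climb({"kick_power": 0.0},
                      fitness=lambda p: -(p["kick_power"] - 2.0) ** 2)

    # Overlapping "seam" learning: re-open only the parameters where the two
    # behaviors interact (e.g., the approach step into the kick) and optimize
    # them jointly while everything else stays frozen.
    seam = hill_climb({"step_len": walk["step_len"],
                       "kick_power": kick["kick_power"]},
                      fitness=lambda p: -(p["step_len"] * p["kick_power"] - 1.0) ** 2)
    print(walk, kick, seam)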
SCRAM: Scalable Collision-avoiding Role Assignment with Minimal-Makespan for Formational Positioning
MacAlpine, Patrick (University of Texas at Austin) | Price, Eric (University of Texas at Austin) | Stone, Peter (University of Texas at Austin)
Teams of mobile robots often need to divide up subtasks efficiently. In spatial domains, a key criterion for doing so may depend on the distances between robots and the subtasks' locations. This paper considers a specific such criterion, namely how to assign interchangeable robots, represented as point masses, to a set of target goal locations within an open two-dimensional space such that the makespan (the time for all robots to reach their target locations) is minimized while also preventing collisions among robots. We present scalable (polynomial-time) role assignment algorithms that we classify as SCRAM (Scalable Collision-avoiding Role Assignment with Minimal-makespan). SCRAM role assignment algorithms use a graph-theoretic approach to map agents to target goal locations such that our objectives of both minimizing the makespan and avoiding agent collisions are met. A system using SCRAM role assignment was originally designed to allow decentralized coordination among physically realistic simulated humanoid soccer-playing robots in the partially observable, non-deterministic, noisy, dynamic, and limited-communication setting of the RoboCup 3D simulation league. In its current form, SCRAM role assignment generalizes well to many realistic and real-world multiagent systems and scales to thousands of agents.
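The makespan-minimization core can be sketched as a bottleneck assignment: binary-search over candidate distances for the smallest threshold that still admits a perfect agent-to-target matching. This is an illustrative reconstruction under that framing, not the paper's exact algorithms, which additionally break ties among minimal-makespan assignments so as to avoid collisions.

    import math

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def perfect_matching(allowed):
        """Augmenting-path bipartite matching; allowed[i] is the set of targets
        agent i may take. Returns (found, match) with match[j] = assigned agent."""
        n = len(allowed)
        match = [-1] * n

        def augment(i, seen):
            for j in allowed[i]:
                if j not in seen:
                    seen.add(j)
                    if match[j] == -1 or augment(match[j], seen):
                        match[j] = i
                        return True
            return False

        return all(augment(i, set()) for i in range(n)), match

    def min_makespan_assignment(agents, targets):
        """Assign agents to targets so the maximum travel distance is minimized."""
        assert len(agents) == len(targets)
        n = len(agents)
        d = [[dist(a, t) for t in targets] for a in agents]
        thresholds = sorted({d[i][j] for i in range(n) for j in range(n)})
        lo, hi, best = 0, len(thresholds) - 1, None
        while lo <= hi:
            mid = (lo + hi) // 2
            allowed = [{j for j in range(n) if d[i][j] <= thresholds[mid]}
                       for i in range(n)]
            ok, match = perfect_matching(allowed)
            if ok:
                best, hi = match, mid - 1
            else:
                lo = mid + 1
        return best  # best[j] = index of the agent sent to targets[j]

    print(min_makespan_assignment([(0, 0), (4, 0)], [(1, 0), (5, 0)]))  # -> [0, 1]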
Design and Optimization of an Omnidirectional Humanoid Walk: A Winning Approach at the RoboCup 2011 3D Simulation Competition
MacAlpine, Patrick (University of Texas at Austin) | Barrett, Samuel (University of Texas at Austin) | Urieli, Daniel (University of Texas at Austin) | Vu, Victor (University of Texas at Austin) | Stone, Peter (University of Texas at Austin)
This paper presents the design and learning architecture for an omnidirectional walk used by a humanoid robot soccer agent acting in the RoboCup 3D simulation environment. The walk, which was originally designed for and tested on an actual Nao robot before being employed in the 2011 RoboCup 3D simulation competition, was the crucial component in the UT Austin Villa team's victory in that competition. To the best of our knowledge, this is the first time that a robot behavior has been conceived and constructed on a real robot for the end purpose of being used in simulation. The walk is based on a double linear inverted pendulum model, and multiple sets of its parameters are optimized via a novel framework. The framework optimizes parameters for different tasks in conjunction with one another, a little-understood problem with substantial practical significance. Detailed experiments show that the UT Austin Villa agent significantly outperforms all the other agents in the competition, with the optimized walk being the key to its success.
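A minimal sketch of the kind of multi-task parameter optimization the framework performs is given below, with a simple evolutionary search and a placeholder fitness function; the walk engine, task names, and optimizer are illustrative assumptions rather than the team's actual implementation.

    import numpy as np

    def simulate_walk(params, task):
        """Placeholder for rolling out the walk engine on one optimization task
        and returning a fitness score; a toy quadratic stands in for simulation."""
        target = {"goToTarget": 1.0, "sprint": 2.0, "positioning": 0.5}[task]
        return -float(np.sum((params - target) ** 2))

    def fitness(params, tasks=("goToTarget", "sprint", "positioning")):
        # Optimize the tasks in conjunction with one another: a candidate walk
        # parameter set is scored by its combined performance across all tasks.
        return sum(simulate_walk(params, t) for t in tasks)

    def evolve(dim=5, pop=20, gens=50, sigma=0.3, seed=0):
        """Simple (mu, lambda)-style evolutionary search standing in for the
        black-box optimizer used to tune the walk parameters."""
        rng = np.random.default_rng(seed)
        mean = np.zeros(dim)
        for _ in range(gens):
            cands = mean + sigma * rng.standard_normal((pop, dim))
            scores = np.array([fitness(c) for c in cands])
            mean = cands[np.argsort(scores)[-pop // 4:]].mean(axis=0)
        return mean

    best_walk_params = evolve()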