AITopics | Gombolay, Matthew

Gombolay, Matthew

Interpretable Policy Specification and Synthesis through Natural Language and RL

Tambwekar, Pradyumna, Silva, Andrew, Gopalan, Nakul, Gombolay, Matthew

arXiv.org Artificial Intelligence, Jan-18-2021

Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: these models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning, and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.

air transportation, decision tree learning, deep learning, (17 more...)

arXiv.org Artificial Intelligence

2101.0714

Country:

North America > United States > Massachusetts (0.14)
North America > United States > Pennsylvania (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Air (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

FireCommander: An Interactive, Probabilistic Multi-agent Environment for Joint Perception-Action Tasks

Seraj, Esmaeil, Wu, Xiyang, Gombolay, Matthew

arXiv.org Artificial Intelligence, Oct-30-2020

The purpose of this tutorial is to help individuals use the FireCommander game environment for research applications. FireCommander is an interactive, probabilistic joint perception-action reconnaissance environment in which a composite team of agents (e.g., robots) cooperate to fight dynamic, propagating firespots (e.g., targets). In the FireCommander game, a team of agents must be tasked to optimally deal with a wildfire situation in an environment with propagating fire areas and facilities such as houses, hospitals, and power stations. The team of agents can accomplish their mission by first sensing (e.g., estimating fire states), then communicating the sensed fire information among each other, and finally taking action to put the firespots out based on the sensed information (e.g., dropping water on estimated fire locations). The FireCommander environment can be useful for research topics spanning a wide range of applications, from Reinforcement Learning (RL) and Learning from Demonstration (LfD) to Coordination, Psychology, Human-Robot Interaction (HRI), and Teaming. Four important facets of the FireCommander environment together create a non-trivial game: (1) Complex Objectives: Multi-objective Stochastic Environment, (2) Probabilistic Environment: Agents' actions result in probabilistic performance, (3) Hidden Targets: Partially Observable Environment, and (4) Uni-task Robots: Perception-only and Action-only agents. The FireCommander environment is the first of its kind to include Perception-only and Action-only agents for coordination. It is a general multi-purpose game that can be useful in a variety of combinatorial optimization problems and stochastic games, such as applications of Reinforcement Learning (RL), Learning from Demonstration (LfD), and Inverse RL (iRL).

agent, computer game, soccer, (17 more...)

arXiv.org Artificial Intelligence

2011.00165

Country: North America > United States > Massachusetts (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.66)

Industry:

Health & Medicine (1.00)
Energy (1.00)
Leisure & Entertainment > Games > Computer Games (0.66)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Human-Robot Team Coordination with Dynamic and Latent Human Task Proficiencies: Scheduling with Learning Curves

Liu, Ruisen, Natarajan, Manisha, Gombolay, Matthew

arXiv.org Artificial Intelligence, Jul-8-2020

As robots become ubiquitous in the workforce, it is essential that human-robot collaboration be both intuitive and adaptive. A robot's quality improves based on its ability to explicitly reason about the time-varying (i.e., learning curves) and stochastic capabilities of its human counterparts, and adjust the joint workload to improve efficiency while factoring in human preferences. We introduce a novel resource coordination algorithm that enables robots to explore the relative strengths and learning abilities of their human teammates by constructing schedules that are robust to stochastic and time-varying human task performance. We first validate our algorithmic approach using data we collected from a user study (n = 20), showing we can quickly generate and evaluate a robust schedule while discovering the latest individual worker proficiency. Second, we conduct a between-subjects experiment (n = 90) to validate the efficacy of our coordinating algorithm. Results from the human-subjects experiment indicate that scheduling strategies favoring exploration tend to be beneficial for human-robot collaboration, as they improve team fluency (p = 0.0438) while also maximizing team efficiency (p < 0.001).

artificial intelligence, constraint-based reasoning, optimization problem, (17 more...)

arXiv.org Artificial Intelligence

2007.01921

Genre: Research Report > Experimental Study (1.00)

Industry: Leisure & Entertainment > Games (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.84)

Personalized Apprenticeship Learning from Heterogeneous Decision-Makers

Paleja, Rohan, Silva, Andrew, Gombolay, Matthew

arXiv.org Artificial Intelligence, Jun-14-2019

Human domain experts solve difficult planning problems by drawing on years of experience. In many cases, computing a solution to such problems is computationally intractable or requires encoding heuristics from human domain experts. As codifying this knowledge leaves much to be desired, we aim to infer their strategies through observation. The challenge lies in that humans exhibit heterogeneity in their latent decision-making criteria. To overcome this, we propose a personalized apprenticeship learning framework that automatically infers a representation of all human task demonstrators by extracting a human-specific embedding. Our framework is built on a propositional architecture that allows for distilling an interpretable representation of each human demonstrator's decision-making.

demonstrator, neural network, planning & scheduling, (19 more...)

arXiv.org Artificial Intelligence

1906.06397

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Interpretable Reinforcement Learning via Differentiable Decision Trees

Rodriguez, Ivan Dario Jimenez, Killian, Taylor, Son, Sung-Hyun, Gombolay, Matthew

arXiv.org Machine Learning, Mar-21-2019

Decision trees are ubiquitous in machine learning for their ease of use and interpretability; however, they are not typically implemented in reinforcement learning because they cannot be updated via stochastic gradient descent. Traditional applications of decision trees for reinforcement learning have focused instead on making commitments to decision boundaries as the tree is grown one layer at a time. We overcome this critical limitation by allowing for a gradient update over the entire tree structure that improves sample complexity when a tree is fuzzy and interpretability when sharp. We offer three key contributions towards this goal. First, we motivate the need for policy gradient-based learning by examining the theoretical properties of gradient descent over differentiable decision trees. Second, we introduce a regularization framework that yields interpretability via sparsity in the tree structure. Third, we demonstrate the ability to construct a decision tree via policy gradient in canonical reinforcement learning domains and supervised learning benchmarks.
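As a rough illustration of why a soft split can be trained by gradient descent while a hard split cannot, the sketch below replaces a crisp test with a sigmoid. It is not the authors' implementation: the function names, the sharpness parameter `alpha`, and all numbers are assumptions made for this example.

```python
import math

def branch_prob(x, w, b, alpha):
    """Probability of taking the 'true' branch at one soft decision node."""
    z = alpha * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return 1.0 / (1.0 + math.exp(-z))

def grad_branch_prob(x, w, b, alpha):
    """Analytic gradient of the branch probability w.r.t. weights and bias."""
    p = branch_prob(x, w, b, alpha)
    scale = alpha * p * (1.0 - p)          # derivative of the sigmoid
    return [scale * xi for xi in x], scale

# Hypothetical one-feature node testing roughly "x[0] > 0.5".
x = [0.7]
w, b = [1.0], -0.5

fuzzy = branch_prob(x, w, b, alpha=1.0)    # soft split, close to 0.5
sharp = branch_prob(x, w, b, alpha=100.0)  # nearly crisp split, close to 1

# One gradient-ascent step that makes the 'true' branch more likely.
gw, gb = grad_branch_prob(x, w, b, alpha=1.0)
w_new = [wi + 0.5 * g for wi, g in zip(w, gw)]
b_new = b + 0.5 * gb
after = branch_prob(x, w_new, b_new, alpha=1.0)
```

With a small `alpha` the node is fuzzy and its gradient is informative; with a large `alpha` it behaves almost like an ordinary, readable threshold test.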

artificial intelligence, decision tree learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

1903.09338

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Safe Coordination of Human-Robot Firefighting Teams

Seraj, Esmaeil, Silva, Andrew, Gombolay, Matthew

arXiv.org Artificial Intelligence, Mar-15-2019

Wildfires are destructive and inflict massive, irreversible harm to victims' lives and natural resources. Researchers have proposed commissioning unmanned aerial vehicles (UAVs) to provide firefighters with real-time tracking information; yet, these UAVs are not able to reason about a fire's track, including current location, measurement, and uncertainty, as well as propagation. We propose a model-predictive, probabilistically safe distributed control algorithm for human-robot collaboration in wildfire fighting. The proposed algorithm overcomes the limitations of prior work by explicitly estimating the latent fire propagation dynamics to enable intelligent, time-extended coordination of the UAVs in support of on-the-ground human firefighters. We derive a novel, analytical bound that enables UAVs to distribute their resources and provides a probabilistic guarantee of the humans' safety while preserving the UAVs' ability to cover an entire fire.

drone, law enforcement, us government, (19 more...)

arXiv.org Artificial Intelligence

1903.06847

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Government > Regional Government > North America Government > United States Government (0.93)
Law Enforcement & Public Safety > Fire & Emergency Services (0.92)
Food & Agriculture (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.66)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.61)

Inferring Personalized Bayesian Embeddings for Learning from Heterogeneous Demonstration

Paleja, Rohan, Gombolay, Matthew

arXiv.org Artificial Intelligence, Mar-14-2019

For assistive robots and virtual agents to achieve ubiquity, machines will need to anticipate the needs of their human counterparts. The field of Learning from Demonstration (LfD) has sought to enable machines to infer predictive models of human behavior for autonomous robot control. However, humans exhibit heterogeneity in decision-making, which traditional LfD approaches fail to capture. To overcome this challenge, we propose a Bayesian LfD framework to infer an integrated representation of all human task demonstrators by inferring human-specific embeddings, thereby distilling their unique characteristics. We validate that our approach outperforms state-of-the-art techniques on both synthetic and real-world data sets.

counterfactual reasoning, deep learning, neural network, (20 more...)

arXiv.org Artificial Intelligence

1903.06047

Country: North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

ProLoNets: Neural-encoding Human Experts' Domain Knowledge to Warm Start Reinforcement Learning

Silva, Andrew, Gombolay, Matthew

arXiv.org Machine Learning, Feb-15-2019

Deep reinforcement learning has seen great success across a breadth of tasks such as game playing and robotic manipulation. However, the modern practice of attempting to learn tabula rasa disregards the logical structure of many domains and the wealth of readily available human domain experts' knowledge that could help "warm start" the learning process. Further, learning from demonstration techniques are not yet sufficient to infer this knowledge through sampling-based mechanisms in large state and action spaces, or require immense amounts of data. We present a new reinforcement learning architecture that can encode expert knowledge, in the form of propositional logic, directly into a neural, tree-like structure of fuzzy propositions that are amenable to gradient descent. We show that our novel architecture is able to outperform reinforcement and imitation learning techniques across an array of canonical challenge problems for artificial intelligence.
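To make the idea of encoding a propositional rule as a fuzzy, trainable node more concrete, here is a minimal sketch. It is only an illustration under stated assumptions, not the ProLoNets code: `encode_rule`, `fuzzy_truth`, `policy`, the sharpness `alpha`, and the cart-pole-style rule are all hypothetical.

```python
import math

def encode_rule(feature_index, threshold, num_features):
    """Encode the proposition `x[feature_index] > threshold` as (weights, bias)."""
    weights = [0.0] * num_features
    weights[feature_index] = 1.0
    return weights, -threshold

def fuzzy_truth(x, weights, bias, alpha=5.0):
    """Degree in [0, 1] to which the encoded proposition holds for state x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-alpha * z))

def policy(x, rule, probs_if_true, probs_if_false, alpha=5.0):
    """Blend the two leaf action distributions by the rule's fuzzy truth value."""
    weights, bias = rule
    p = fuzzy_truth(x, weights, bias, alpha)
    return [p * a + (1.0 - p) * b for a, b in zip(probs_if_true, probs_if_false)]

# Hypothetical expert rule: "if the pole angle (feature 2) is positive, push right".
rule = encode_rule(feature_index=2, threshold=0.0, num_features=4)
leaf_right = [0.1, 0.9]   # [P(push left), P(push right)] when the rule holds
leaf_left = [0.9, 0.1]    # distribution when the rule does not hold

tilted_right = policy([0.0, 0.0, 0.3, 0.0], rule, leaf_right, leaf_left)
tilted_left = policy([0.0, 0.0, -0.3, 0.0], rule, leaf_right, leaf_left)
```

Because the weights, bias, and leaf distributions are ordinary real-valued parameters, a policy initialized this way could in principle be refined further by gradient-based reinforcement learning.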

agent, computer game, deep learning, (18 more...)

arXiv.org Machine Learning

1902.06007

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Human-Machine Collaborative Optimization via Apprenticeship Scheduling

Gombolay, Matthew, Jensen, Reed, Stigile, Jessica, Golen, Toni, Shah, Neel, Son, Sung-Hyun, Shah, Julie

Journal of Artificial Intelligence Research, Sep-17-2018

Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes. We propose a new approach for capturing this decision-making process through counterfactual reasoning in pairwise comparisons. Our approach is model-free and does not require iterating through the state space. We demonstrate that this approach accurately learns multifaceted heuristics on synthetic and real-world data sets. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of schedule optimization. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates optimal solutions up to 9.5 times faster than a state-of-the-art optimization algorithm.

demonstration, machine learning, reinforcement learning, (20 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11233

AI Access Foundation

Country:

Europe (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)
Overview (0.92)

Industry:

Transportation > Air (1.00)
Leisure & Entertainment > Games (1.00)
Information Technology (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
(3 more...)

Human-Machine Collaborative Optimization via Apprenticeship Scheduling

Gombolay, Matthew, Jensen, Reed, Stigile, Jessica, Golen, Toni, Shah, Neel, Son, Sung-Hyun, Shah, Julie

arXiv.org Artificial Intelligence, May-10-2018

Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator.
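The pairwise ranking formulation mentioned above can be sketched roughly as follows. This is a simplified illustration, not the paper's method or data: the two task features (deadline, duration), the plain logistic model, the training settings, and every function name are assumptions chosen for the example.

```python
import math

def pairwise_examples(candidates, chosen):
    """Turn one expert decision into labeled feature differences.

    candidates: dict mapping task name -> feature list
    chosen: the task the expert scheduled at this decision point
    """
    examples = []
    for task, feats in candidates.items():
        if task == chosen:
            continue
        diff = [a - b for a, b in zip(candidates[chosen], feats)]
        examples.append((diff, 1))                # chosen ranked above the other task
        examples.append(([-d for d in diff], 0))  # other task not ranked above chosen
    return examples

def train_logistic(examples, dim, lr=0.5, epochs=100):
    """Fit a linear pairwise-preference model with stochastic gradient steps."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in examples:
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

def pick_task(candidates, w):
    """Choose the task with the highest total preference score over all rivals."""
    def score(task):
        total = 0.0
        for other, feats in candidates.items():
            if other != task:
                diff = [a - b for a, b in zip(candidates[task], feats)]
                total += sum(wi * di for wi, di in zip(w, diff))
        return total
    return max(candidates, key=score)

# Toy demonstrations: features are [deadline, duration]; the hypothetical
# expert always schedules the task with the earliest deadline.
demos = [
    ({"A": [2, 5], "B": [7, 1], "C": [9, 8]}, "A"),
    ({"D": [8, 2], "E": [3, 6], "F": [5, 9]}, "E"),
]
data = [ex for cands, chosen in demos for ex in pairwise_examples(cands, chosen)]
weights = train_logistic(data, dim=2)
prediction = pick_task({"X": [6, 3], "Y": [4, 3]}, weights)
```

Each demonstration only contributes comparisons between the task the expert picked and the tasks that were available at that moment, so nothing here requires enumerating the scheduling state space.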

air transportation, constraint-based reasoning, demonstration, (22 more...)

arXiv.org Artificial Intelligence

1805.0422

Country:

Europe (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)
Overview (0.92)

Industry:

Transportation > Air (1.00)
Leisure & Entertainment > Games (1.00)
Information Technology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(4 more...)
