AITopics

2012.07228

Country:

Asia > China > Anhui Province > Hefei (0.06)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Zhang, Qi, Durfee, Edmund H., Singh, Satinder

Efficient Querying for Cooperative Probabilistic Commitments

arXiv.org Artificial Intelligence, Dec-13-2020

Multiagent systems can use commitments as the core of a general coordination infrastructure, supporting both cooperative and non-cooperative interactions. Agents whose objectives are aligned, and where one agent can help another achieve greater reward by sacrificing some of its own reward, should choose a cooperative commitment to maximize their joint reward. We present a solution to the problem of how cooperative agents can efficiently find an (approximately) optimal commitment by querying about carefully-selected commitment choices. We prove structural properties of the agents' values as functions of the parameters of the commitment specification, and develop a greedy method for composing a query with provable approximation bounds, which we empirically show can find nearly optimal commitments in a fraction of the time methods that lack our insights require.

agent, provider, query

2012.07195

Country:

Europe > Slovenia > Central Slovenia > Municipality of Komenda > Komenda (0.04)
North America > United States > South Carolina (0.04)
North America > United States > Michigan (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.54)

A Unified Model for the Two-stage Offline-then-Online Resource Allocation

Xu, Yifan, Xu, Pan, Pan, Jianping, Tao, Jun

With the popularity of the Internet, traditional offline resource allocation has evolved into a new form, called online resource allocation. It features the online arrivals of agents in the system and the real-time decision-making requirement upon the arrival of each online agent. Both offline and online resource allocation have wide applications in various real-world matching markets ranging from ridesharing to crowdsourcing. There are some emerging applications such as rebalancing in bike sharing and trip-vehicle dispatching in ridesharing, which involve a two-stage resource allocation process.

Furthermore, upon the arrival of any online agent, we have to decide quickly and irrevocably which offline agent(s) to match it with. That is mainly due to the low "patience" of the online agents. These features--online arrivals and the real-time decision-making requirement--distinguish OMMs from traditional matching markets where the information of all agents is fully disclosed in advance. OMMs have received significant interest in both computer science and operations research communities. There is a large body of research work that has studied matching policy design for profit maximization in ridesharing [Ashlagi et al., 2019; Lowalekar et al., 2018; Bei and Zhang, 2018; Zhao et al., 2019; Dickerson et al., 2018a; Li et al., 2020],

assignment, non-integral resource, phase ii

doi: 10.24963/ijcai.2020/581

2012.06845

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)
North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.04)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Infrastructure & Services (0.74)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
Information Technology > Communications > Social Media > Crowdsourcing (0.35)

Learning Multi-Arm Manipulation Through Collaborative Teleoperation

Tung, Albert, Wong, Josiah, Mandlekar, Ajay, Martín-Martín, Roberto, Zhu, Yuke, Fei-Fei, Li, Savarese, Silvio

Imitation Learning (IL) is a powerful paradigm to teach robots to perform manipulation tasks by allowing them to learn from human demonstrations collected via teleoperation, but has mostly been limited to single-arm manipulation. However, many real-world tasks require multiple arms, such as lifting a heavy object or assembling a desk. Unfortunately, applying IL to multi-arm manipulation tasks has been challenging -- asking a human to control more than one robotic arm can impose significant cognitive burden and is often only possible for a maximum of two robot arms. To address these challenges, we present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks. Using MART, we collected demonstrations for five novel two- and three-arm tasks from several geographically separated users. From our data we arrived at a critical insight: most multi-arm tasks do not require global coordination throughout their full duration, but only during specific moments. We show that learning from such data consequently presents challenges for centralized agents that directly attempt to model all robot actions simultaneously, and perform a comprehensive study of different policy architectures with varying levels of centralization on our tasks. Finally, we propose and evaluate a base-residual policy framework that allows trained policies to better adapt to the mixed coordination setting common in multi-arm manipulation, and show that a centralized policy augmented with a decentralized residual model outperforms all other models on our set of benchmark tasks. Additional results and videos at https://roboturk.stanford.edu/multiarm .

coordination, demonstration, multi-arm task

2012.06738

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.24)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Human-in-the-Loop Imitation Learning using Remote Teleoperation

Mandlekar, Ajay, Xu, Danfei, Martín-Martín, Roberto, Zhu, Yuke, Fei-Fei, Li, Savarese, Silvio

Imitation Learning is a promising paradigm for learning complex robot manipulation skills by reproducing behavior from human demonstrations. However, manipulation tasks often contain bottleneck regions that require a sequence of precise actions to make meaningful progress, such as a robot inserting a pod into a coffee machine to make coffee. Trained policies can fail in these regions because small deviations in actions can lead the policy into states not covered by the demonstrations. Intervention-based policy learning is an alternative that can address this issue -- it allows human operators to monitor trained policies and take over control when they encounter failures. In this paper, we build a data collection system tailored to 6-DoF manipulation settings that enables remote human operators to monitor and intervene on trained policies. We develop a simple and effective algorithm to train the policy iteratively on new data collected by the system that encourages the policy to learn how to traverse bottlenecks through the interventions. We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators, and further show that our method outperforms multiple state-of-the-art baselines for learning from the human interventions on a challenging robot threading task and a coffee making task. Additional results and videos at https://sites.google.com/stanford.edu/iwr .

dataset, intervention, learning

2012.06733

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.24)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Hazra, Rishi, Dixit, Sonu, Sen, Sayambhu

Infinite use of finite means: Zero-Shot Generalization using Compositional Emergent Protocols

Human language has been described as a system that makes use of finite means to express an unlimited array of thoughts. Of particular interest is the aspect of compositionality, whereby the meaning of a complex, compound language expression can be deduced from the meaning of its constituent parts. If artificial agents can develop compositional communication protocols akin to human language, they can be made to seamlessly generalize to unseen combinations. However, the real question is, how do we induce compositionality in emergent communication? Studies have recognized the role of curiosity in enabling linguistic development in children. It is this same intrinsic urge that drives us to master complex tasks with decreasing amounts of explicit reward. In this paper, we seek to use this intrinsic feedback in inducing a systematic and unambiguous protolanguage in artificial agents. We show in our experiments how these rewards can be leveraged in training agents to induce compositionality in the absence of any external feedback. Additionally, we introduce Comm-gSCAN, a platform for investigating grounded language acquisition in 2D-grid environments. Using this, we demonstrate how compositionality can enable agents not only to interact with unseen objects, but also to transfer skills from one task to another in a zero-shot setting (Can an agent, trained to pull and push twice, pull twice?).

communication, compositionality, listener

2012.05011

Country:

North America > United States > New York (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)

Naumov, Pavel, Ros, Kevin

Comprehension and Knowledge

arXiv.org Artificial Intelligence, Dec-11-2020

The ability of an agent to comprehend a sentence is tightly connected to the agent's prior experiences and background knowledge. The paper suggests interpreting comprehension as a modality and proposes a complete bimodal logical system that describes the interplay between the comprehension and knowledge modalities.

artificial intelligence, logic & formal reasoning, natural language

2012.06561

Country:

South America > Colombia > Bogotá D.C. > Bogotá (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois (0.04)

Genre: Research Report (0.70)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.70)

Epstein, Sophia, Naumov, Pavel

Epistemic Logic of Know-Who

arXiv.org Artificial Intelligence, Dec-11-2020

The paper suggests a definition of "know who" as a modality using Grove-Halpern semantics of names. It also introduces a logical system that describes the interplay between modalities "knows who", "knows", and "for all agents". The main technical result is a completeness theorem for the proposed system.

agent, definition 2, inference rule

2012.06651

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Hawaii (0.04)

Genre: Research Report (0.70)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)
Government > Regional Government > North America Government > United States Government (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Mohanty, Sharada, Nygren, Erik, Laurent, Florian, Schneider, Manuel, Scheller, Christian, Bhattacharya, Nilabha, Watson, Jeremy, Egli, Adrian, Eichenberger, Christian, Baumberger, Christian, Vienken, Gereon, Sturm, Irene, Sartoretti, Guillaume, Spigler, Giacomo

Flatland-RL: Multi-Agent Reinforcement Learning on Trains

arXiv.org Artificial Intelligence, Dec-11-2020

Efficient automated scheduling of trains remains a major challenge for modern railway systems. The underlying vehicle rescheduling problem (VRSP) has been a major focus of Operations Research (OR) for decades. Traditional approaches use complex simulators to study VRSP, where experimenting with a broad range of novel ideas is time-consuming and has a huge computational overhead. In this paper, we introduce a two-dimensional simplified grid environment called "Flatland" that allows for faster experimentation. Flatland not only reduces the complexity of the full physical simulation, but also provides an easy-to-use interface to test novel approaches for the VRSP, such as Reinforcement Learning (RL) and Imitation Learning (IL). In order to probe the potential of Machine Learning (ML) research on Flatland, we (1) ran a first series of RL and IL experiments and (2) designed and executed a public Benchmark at NeurIPS 2020 to engage a large community of researchers to work on this problem. Our own experimental results, on the one hand, demonstrate that ML has potential in solving the VRSP on Flatland. On the other hand, we identify key topics that need further research. Overall, the Flatland environment has proven to be a robust and valuable framework to investigate the VRSP for railway networks. Our experiments provide a good starting point for further research and for the participants of the NeurIPS 2020 Flatland Benchmark. All of these efforts together have the potential to have a substantial impact on shaping the mobility of the future.

agent, experiment, flatland

2012.05893

Country:

Europe > Germany (0.05)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Sweden > Skåne County > Malmö (0.04)
Asia > Singapore (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.86)

Industry: Transportation > Ground > Rail (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, Dec-10-2020

Imitating Interactive Intelligence

Abramson, Josh, Ahuja, Arun, Brussee, Arthur, Carnevale, Federico, Cassin, Mary, Clark, Stephen, Dudzik, Andrew, Georgiev, Petko, Guy, Aurelia, Harley, Tim, Hill, Felix, Hung, Alden, Kenton, Zachary, Landon, Jessica, Lillicrap, Timothy, Mathewson, Kory, Muldal, Alistair, Santoro, Adam, Savinov, Nikolay, Varma, Vikrant, Wayne, Greg, Wong, Nathaniel, Yan, Chen, Zhu, Rui

A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central challenges of artificial intelligence (AI) research: complex visual perception and goal-directed physical control, grounded language comprehension and production, and multi-agent social interaction. To build agents that can robustly interact with humans, we would ideally train them while they interact with humans. However, this is presently impractical. Therefore, we approximate the role of the human with another learned agent, and use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour. Rigorously evaluating our agents poses a great challenge, so we develop a variety of behavioural tests, including evaluation by humans who watch videos of agents or interact directly with them. These evaluations convincingly demonstrate that interactive training and auxiliary losses improve agent behaviour beyond what is achieved by supervised learning of actions alone. Further, we demonstrate that agent capabilities generalise beyond literal experiences in the dataset. Finally, we train evaluation models whose ratings of agents agree well with human judgement, thus permitting the evaluation of new agent models without additional effort. Taken together, our results in this virtual environment provide evidence that large-scale human behavioural imitation is a promising tool to create intelligent, interactive agents, and the challenge of reliably evaluating such agents is possible to surmount.

agent, instruction, interaction

2012.05672

Country: Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.45)

Industry:

Leisure & Entertainment > Games (1.00)
Education (1.00)
Transportation > Ground > Road (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)