AITopics | Agents


Agents


An Option and Agent Selection Policy with Logarithmic Regret for Multi Agent Multi Armed Bandit Problems on Random Graphs

Pankayaraj, Pathmanathan, Maithripala, D. H. S.

arXiv.org Machine Learning, Oct-15-2019

Existing studies of the Multi Agent Multi Armed Bandit (MAMAB) problem, with very few exceptions, consider the case where the agents observe their neighbors according to a static network graph. Most also rely on a running consensus to estimate the option rewards. Two of the exceptions consider agents that observe the instantaneous rewards and actions of their neighbors through a communication strategy based on an i.i.d. Erdős-Rényi (ER) graph process. In this paper we propose a UCB-based option allocation rule that guarantees logarithmic regret even if the graph depends on the history of choices made by the agents. The paper also proposes a novel communication strategy that significantly outperforms the i.i.d. ER graph based communication strategy. For both the ER graph and the dependent graph strategies, the regret is shown to depend on the connectivity of the graph in a notable way: there exists an optimal connectivity that is less than full connectivity.
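For readers unfamiliar with UCB-style allocation, the following is a minimal sketch of the classical single-agent UCB1 index rule. It is not the multi-agent rule proposed in the paper, and it ignores neighbor observations and graph structure entirely; the function names `ucb1` and `bernoulli`, the reward probabilities, and the horizon are all illustrative assumptions.

```python
import math
import random


def ucb1(reward_fns, horizon, seed=0):
    """Run classical UCB1 for `horizon` rounds; return pull counts per option.

    Single-agent illustration only: no neighbors, no communication graph.
    """
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k    # number of times each option was chosen
    sums = [0.0] * k    # cumulative reward collected from each option
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initial pass: try every option once
        else:
            # Index = empirical mean + exploration bonus that shrinks with pulls.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        sums[arm] += reward
    return counts


def bernoulli(p):
    """Hypothetical option: pays 1.0 with probability p, else 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


counts = ucb1([bernoulli(0.1), bernoulli(0.3), bernoulli(0.9)], horizon=2000)
print(counts)  # the third option should receive the bulk of the pulls
```

The exploration bonus grows only logarithmically in the round number, which is the mechanism behind logarithmic regret guarantees for this family of rules.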

agent, allocation rule, connectivity, (17 more...)

arXiv.org Machine Learning

1910.02635

Country:

Asia > Sri Lanka (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.37)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)


Challenges of Human-Aware AI Systems

Kambhampati, Subbarao

arXiv.org Artificial Intelligence, Oct-15-2019

From its inception, AI has had a rather ambivalent relationship to humans, swinging between their augmentation and replacement. Now, as AI technologies enter our everyday lives at an ever increasing pace, there is a greater need for AI systems to work synergistically with humans. To do this effectively, AI systems must pay more attention to aspects of intelligence that helped humans work with each other, including social intelligence. I will discuss the research challenges in designing such human-aware AI systems, including modeling the mental states of humans in the loop, recognizing their desires and intentions, providing proactive support, exhibiting explicable behavior, giving cogent explanations on demand, and engendering trust. I will survey the progress made so far on these challenges, and highlight some promising directions. I will also touch on the additional ethical quandaries that such systems pose. I will end by arguing that the quest for human-aware AI systems broadens the scope of AI enterprise, necessitates and facilitates true inter-disciplinary collaborations, and can go a long way towards increasing public acceptance of AI technologies.

ai agent, explanation, kambhampati, (15 more...)

arXiv.org Artificial Intelligence

1910.07089

Country:

North America > United States > Arizona (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
(2 more...)

Genre:

Overview (0.66)
Research Report (0.50)
Personal (0.46)

Industry: Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(3 more...)


Multiagent Rollout Algorithms and Reinforcement Learning

Bertsekas, Dimitri

arXiv.org Artificial Intelligence, Oct-15-2019

We consider finite and infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. We introduce an algorithm, whereby at every stage, each agent's decision is made by executing a local rollout algorithm that uses a base policy, together with some coordinating information from the other agents. The amount of local computation required at every stage by each agent is independent of the number of agents, while the amount of global computation (over all agents) grows linearly with the number of agents. By contrast, with the standard rollout algorithm, the amount of global computation grows exponentially with the number of agents. Despite the drastic reduction in required computation, we show that our algorithm has the fundamental cost improvement property of rollout: an improved performance relative to the base policy. We also explore related reinforcement learning and approximate policy iteration algorithms, and we discuss how this cost improvement property is affected when we attempt to improve further the method's computational efficiency through parallelization of the agents' computations.
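The computational contrast described above can be sketched on a toy problem. The code below is an illustrative assumption, not the paper's algorithm as published: it uses a hypothetical deterministic scalar system, a "do nothing" base policy, and an agent-by-agent scheme in which each agent optimizes its own control given the choices already made by earlier agents, with later agents held at the base policy. All names (`CONTROLS`, `step`, `base_policy`, `base_cost`, `multiagent_rollout`) are made up for this sketch.

```python
from typing import List, Sequence, Tuple

CONTROLS = (-1, 0, 1)  # hypothetical control set available to every agent


def step(x: int, u: Sequence[int]) -> Tuple[int, float]:
    """Toy dynamics: agents jointly shift the state; cost penalizes |state| and effort."""
    x_next = x + sum(u)
    cost = x_next ** 2 + 0.1 * sum(abs(ui) for ui in u)
    return x_next, cost


def base_policy(x: int, m: int) -> List[int]:
    """Base policy assumed here: every agent applies the zero control."""
    return [0] * m


def base_cost(x: int, stages: int, m: int) -> float:
    """Cost of following the base policy for `stages` stages starting from x."""
    total = 0.0
    for _ in range(stages):
        x, c = step(x, base_policy(x, m))
        total += c
    return total


def multiagent_rollout(x0: int, horizon: int, m: int) -> Tuple[float, int]:
    """Agent-by-agent rollout; returns (total cost, number of Q-factor evaluations)."""
    x, total, q_evals = x0, 0.0, 0
    for k in range(horizon):
        u = base_policy(x, m)  # start from the base controls
        for i in range(m):     # agents decide one at a time
            best_u, best_q = u[i], float("inf")
            for cand in CONTROLS:
                trial = u[:i] + [cand] + u[i + 1:]
                x_next, c = step(x, trial)
                # Q-factor: stage cost plus base-policy cost over remaining stages.
                q = c + base_cost(x_next, horizon - k - 1, m)
                q_evals += 1
                if q < best_q:
                    best_u, best_q = cand, q
            u[i] = best_u      # later agents see this choice
        x, c = step(x, u)
        total += c
    return total, q_evals


rollout_total, evals = multiagent_rollout(x0=5, horizon=4, m=3)
print(rollout_total, evals, base_cost(5, 4, 3))
```

In this sketch each stage needs `m * len(CONTROLS)` Q-factor evaluations (9 for three agents), whereas minimizing over all joint controls would need `len(CONTROLS) ** m` (27), which is the linear-versus-exponential gap the abstract refers to. On this toy instance the rollout cost also comes out below the base-policy cost.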

agent, algorithm, rollout algorithm, (10 more...)

arXiv.org Artificial Intelligence

1910.0012

Country:

North America > United States > Massachusetts > Middlesex County > Belmont (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.74)


Optimal Clustering from Noisy Binary Feedback

Ariu, Kaito, Ok, Jungseul, Proutiere, Alexandre, Yun, Se-Young

arXiv.org Machine Learning, Oct-14-2019

We study the problem of recovering clusters from binary user feedback. Items are grouped into initially unknown non-overlapping clusters. To recover these clusters, the learner sequentially presents to users a finite list of items together with a question with a binary answer selected from a fixed finite set. For each of these items, the user provides a random answer whose expectation is determined by the item cluster, the question, and an item-specific parameter characterizing the hardness of classifying the item. The objective is to devise an algorithm with a minimal cluster recovery error rate. We derive problem-specific information-theoretical lower bounds on the error rate satisfied by any algorithm, for both uniform and adaptive (list, question) selection strategies. For uniform selection, we present a simple algorithm built upon K-means whose performance almost matches the fundamental limits. For adaptive selection, we develop an adaptive algorithm that is inspired by the derivation of the information-theoretical error lower bounds, and in turn allocates the budget in an efficient way. The algorithm learns to select hard-to-cluster items and relevant questions more often. We compare numerically the performance of our algorithms with and without the adaptive selection strategy, and illustrate the gain achieved by being adaptive. Our inference problems are motivated by the problem of solving large-scale labeling tasks with minimal effort put on the users. For example, in some recent CAPTCHA systems, users' clicks (binary answers) can be used to efficiently label images, by optimally finding the best questions to present.

algorithm, error rate, inull, (16 more...)

arXiv.org Machine Learning

1910.06002

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Visual Hide and Seek

Chen, Boyuan, Song, Shuran, Lipson, Hod, Vondrick, Carl

arXiv.org Artificial Intelligence, Oct-14-2019

Boyuan Chen, Shuran Song, Hod Lipson, and Carl Vondrick (all Columbia University). Abstract: We train embodied agents to play Visual Hide and Seek, where a prey must navigate in a simulated environment in order to avoid capture by a predator. We place a variety of obstacles in the environment for the prey to hide behind, and we only give the agents partial observations of their environment using an egocentric perspective. Although we train the model to play this game from scratch, experiments and visualizations suggest that the agent learns to predict its own visibility in the environment. Furthermore, we quantitatively analyze how agent weaknesses, such as slower speed, affect the learned policy. Our results suggest that, although agent weaknesses make the learning problem more challenging, they also cause more useful features to be learned. Our project website is available at: http://www.cs.columbia.edu/ We designed this game to mimic the typical dynamics between predator and prey. For example, we place a variety of obstacles inside the environment, which create occlusions that the agent can leverage to hide behind. We also only give the agents access to the first-person perspective of their three-dimensional environment. Consequently, this task is a substantial challenge for reinforcement learning because the state is both visual (pixel input) and partially observable (due to occlusions).

agent, arxiv preprint arxiv, representation, (13 more...)

arXiv.org Artificial Intelligence

1910.07882

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)


Federated Transfer Reinforcement Learning for Autonomous Driving

Liang, Xinle, Liu, Yang, Chen, Tianjian, Liu, Ming, Yang, Qiang

arXiv.org Artificial Intelligence, Oct-14-2019

Xinle Liang 1, Yang Liu 1, Tianjian Chen 1, Ming Liu 2 and Qiang Yang 1. Abstract: Reinforcement learning (RL) is widely used in autonomous driving tasks, and training RL models typically involves a multi-step process: pre-training RL models on simulators, uploading the pre-trained model to real-life robots, and fine-tuning the weight parameters on robot vehicles. This sequential process is extremely time-consuming and, more importantly, knowledge from the fine-tuned model stays local and cannot be reused or leveraged collaboratively. To tackle this problem, we present an online federated RL transfer process for real-time knowledge extraction, where all the participant agents take actions using the knowledge learned by others, even when they are acting in very different environments. To validate the effectiveness of the proposed approach, we constructed a real-life collision avoidance system with the Microsoft AirSim simulator and NVIDIA Jetson TX2 car agents, which cooperatively learn from scratch to avoid collisions in an indoor environment with obstacle objects. We demonstrate that with the proposed framework, the simulator car agents can transfer knowledge to the RC cars in real time, with a 27% increase in the average distance from obstacles and a 42% decrease in collision counts. I. INTRODUCTION Recent Reinforcement Learning (RL) research in autonomous robots has achieved significant performance improvements by employing a distributed architecture for decentralized agents [1], [2], which is termed Distributed Reinforcement Learning (DRL). However, most existing DRL frameworks consider only synchronous learning with a constant environment.

agent, knowledge, learning, (14 more...)

arXiv.org Artificial Intelligence

1910.06001

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Robotics & Automation (0.86)
Transportation > Ground > Road (0.72)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Investorideas.com Newswire - AI Stock News: GBT (OTCPINK: GTCH) Implementing New Approach within its Intelligent Agent

#artificialintelligence, Oct-13-2019, 16:02:20 GMT

(Investorideas.com Newswire) GBT Technologies Inc. (OTCPINK: GTCH) ("GBT", or the "Company"), a company specializing in the development of Internet of Things (IoT) and Artificial Intelligence (AI) enabled networking and tracking technologies, including its GopherInsight wireless mesh network technology platform for both mobile and fixed solutions, announced that it is now implementing a new approach within its intelligent agent: recurrent relational reasoning (RRN). The new set of algorithms enables GBT's AI system to explicitly consider relations between objects (static or moving) or abstract ideas. The RRN methodology will be implemented within Avant! AI within the next months, giving it a logic analysis boost to handle vast amounts of information and complex data interpretation. One of the key reasons for implementing this new method is to achieve outstanding image-based reasoning tasks for Avant!

forward-looking statement, investoridea, relation, (12 more...)

#artificialintelligence

Country: North America > United States > California > Los Angeles County > Santa Monica (0.05)

Genre: Press Release (0.63)

Industry:

Media > News (0.31)
Law > Intellectual Property & Technology Law (0.30)
Banking & Finance > Trading (0.30)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)


On the Utility of Learning about Humans for Human-AI Coordination

Carroll, Micah, Shah, Rohin, Ho, Mark K., Griffiths, Thomas L., Seshia, Sanjit A., Abbeel, Pieter, Dragan, Anca

arXiv.org Artificial Intelligence, Oct-13-2019

While we would like agents that can coordinate with humans, current algorithms such as self-play and population-based training create agents that can coordinate with themselves. Agents that assume their partner to be optimal or similar to them can converge to coordination protocols that fail to understand and be understood by humans. To demonstrate this, we introduce a simple environment that requires challenging coordination, based on the popular game Overcooked, and learn a simple model that mimics human play. We evaluate the performance of agents trained via self-play and population-based training. These agents perform very well when paired with themselves, but when paired with our human model, they are significantly worse than agents designed to play with the human model. An experiment with a planning algorithm yields the same conclusion, though only when the human-aware planner is given the exact human model that it is playing with. A user study with real humans shows this pattern as well, though less strongly. Qualitatively, we find that the gains come from having the agent adapt to the human's gameplay. Given this result, we suggest several approaches for designing agents that learn about humans in order to better coordinate with them. Code is available at https://github.com/HumanCompatibleAI/overcooked_ai.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1910.05789

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)


Learning Everywhere: A Taxonomy for the Integration of Machine Learning and Simulations

Fox, Geoffrey, Jha, Shantenu

arXiv.org Machine Learning, Oct-13-2019

We present a taxonomy of research on Machine Learning (ML) applied to enhance simulations, together with a catalog of some activities. We cover eight patterns for linking ML to simulations or systems, plus three algorithmic areas: particle dynamics, agent-based models, and partial differential equations. The patterns are further divided into three action areas: Improving simulation with Configurations and Integration of Data, Learn Structure, Theory and Model for Simulation, and Learn to make Surrogates.

machine learning, mlaroundhpc, simulation, (13 more...)

arXiv.org Machine Learning

1909.1334

Country:

North America > United States > District of Columbia > Washington (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Japan > Honshū > Kansai > Hyogo Prefecture > Kobe (0.04)
(7 more...)

Genre: Research Report (0.42)

Industry:

Health & Medicine > Therapeutic Area (0.94)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)


Deep Crowd-Flow Prediction in Built Environments

Sohn, Samuel S., Moon, Seonghyeon, Zhou, Honglu, Yoon, Sejong, Pavlovic, Vladimir, Kapadia, Mubbasir

arXiv.org Artificial Intelligence, Oct-13-2019

Predicting the behavior of crowds in complex environments is a key requirement in a multitude of application areas, including crowd and disaster management, architectural design, and urban planning. Given a crowd's immediate state, current approaches simulate crowd movement to arrive at a future state. However, most applications require the ability to predict hundreds of possible simulation outcomes (e.g., under different environment and crowd situations) at real-time rates, for which these approaches are prohibitively expensive. In this paper, we propose an approach to instantly predict the long-term flow of crowds in arbitrarily large, realistic environments. Central to our approach is a novel CAGE representation consisting of Capacity, Agent, Goal, and Environment-oriented information, which efficiently encodes and decodes crowd scenarios into compact, fixed-size representations that are environmentally lossless. We present a framework to facilitate the accurate and efficient prediction of crowd flow in never-before-seen crowd scenarios. We conduct a series of experiments to evaluate the efficacy of our approach and showcase positive results.

agent, crowd flow, prediction, (14 more...)

arXiv.org Artificial Intelligence

1910.0581

Country:

North America > United States > New Jersey (0.04)
North America > Canada (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.89)
