AITopics

2010.04914

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(8 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Zhang, Kaiqing, Kakade, Sham M., Başar, Tamer, Yang, Lin F.

Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

arXiv.org Machine Learning, Oct-9-2020

Model-based reinforcement learning (RL), which finds an optimal policy using an empirical model, has long been recognized as one of the cornerstones of RL. It is especially suitable for multi-agent RL (MARL), as it naturally decouples the learning and the planning phases, and avoids the non-stationarity problem that arises when all agents improve their policies simultaneously using samples. Though intuitive and widely used, the sample complexity of model-based MARL algorithms has not been fully investigated. In this paper, our goal is to address the fundamental question about its sample complexity. We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative model. We show that model-based MARL achieves a sample complexity of $\tilde O(|S||A||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error, and the $\epsilon$-NE policies with a smooth planning oracle, where $\gamma$ is the discount factor, and $S,A,B$ denote the state space and the action spaces of the two agents. We further show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge, by establishing a matching lower bound. This is in contrast to the usual reward-aware setting, with a $\tilde\Omega(|S|(|A|+|B|)(1-\gamma)^{-3}\epsilon^{-2})$ lower bound, where this model-based approach is near-optimal with only a gap in the $|A|,|B|$ dependence. Our results not only demonstrate the sample efficiency of this basic model-based approach in MARL, but also elaborate on the fundamental tradeoff between its power (easily handling the more challenging reward-agnostic case) and its limitation (less adaptive and suboptimal in $|A|,|B|$), a tradeoff that arises particularly in the multi-agent context.
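
As a rough worked comparison of the two bounds quoted above (ignoring logarithmic factors and constants, which is an assumption of this sketch), dividing the model-based upper bound by the reward-aware lower bound gives

$$\frac{|S||A||B|(1-\gamma)^{-3}\epsilon^{-2}}{|S|(|A|+|B|)(1-\gamma)^{-3}\epsilon^{-2}} = \frac{|A||B|}{|A|+|B|}, \qquad \frac{\min(|A|,|B|)}{2} \le \frac{|A||B|}{|A|+|B|} < \min(|A|,|B|).$$

So the gap depends only on the action-space sizes: when $|A|=|B|=n$ it equals $n/2$, while the dependence on $|S|$, $\gamma$, and $\epsilon$ already matches.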

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2007.07461

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Illinois (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre: Research Report (0.84)

Industry:

Leisure & Entertainment > Games (0.67)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Esmaeili, Ahmad, Gallagher, John C., Springer, John A., Matson, Eric T.

HAMLET: A Hierarchical Agent-based Machine Learning Platform

Hierarchical Multi-Agent Systems provide a convenient and relevant way to analyze, model, and simulate complex systems in which a large number of entities are interacting at different levels of abstraction. In this paper, we introduce HAMLET (Hierarchical Agent-based Machine LEarning plaTform), a platform based on hierarchical multi-agent systems, to facilitate the research and democratization of machine learning entities distributed geographically or locally. This is carried out by first modeling the machine learning solutions as a hypergraph and then autonomously setting up a multi-level structure composed of heterogeneous agents based on their innate capabilities and learned skills. HAMLET aids the design and management of machine learning systems and provides analytical capabilities for the research communities to assess existing and/or new algorithms/datasets through flexible and customizable queries. The proposed platform does not assume restrictions on the type of machine learning algorithms/datasets and is theoretically proven to be sound and complete with polynomial computational requirements. Additionally, it is examined empirically on 120 training and four generalized batch testing tasks performed on 24 machine learning algorithms and 9 standard datasets. The experimental results not only establish confidence in the platform's consistency and correctness but also demonstrate its testing and analytical capacity.

artificial intelligence, holon, machine learning, (18 more...)

2010.04894

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > Ohio > Hamilton County > Cincinnati (0.04)
(3 more...)

Genre: Research Report (0.63)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Dominguez, Maya Abo, La, William, Boerkoel, James C. Jr

Modeling Human Temporal Uncertainty in Human-Agent Teams

Automated scheduling is potentially a very useful tool for facilitating efficient, intuitive interactions between a robot and a human teammate. However, a current gap in automated scheduling is that it is not well understood how to best represent the timing uncertainty that human teammates introduce. This paper attempts to address this gap by designing an online human-robot collaborative packaging game that we use to build a model of human timing uncertainty from a population of crowd-workers. We conclude that heavy-tailed distributions are the best models of human temporal uncertainty, with a Log-Normal distribution achieving the best fit to our experimental data. We discuss how these results, along with our collaborative online game, will inform and facilitate future explorations into scheduling for improved human-robot fluency.

artificial intelligence, fluency, interaction, (16 more...)

2010.04849

Country: North America > United States > California > Los Angeles County > Claremont (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.64)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Weber, Tom, Wermter, Stefan

Integrating Intrinsic and Extrinsic Explainability: The Relevance of Understanding Neural Networks for Human-Robot Interaction

Explainable artificial intelligence (XAI) can help foster trust in and acceptance of intelligent and autonomous systems. Moreover, understanding the motivation for an agent's behavior results in better and more successful collaborations between robots and humans. However, not only can humans benefit from a robot's explanation, but the robot itself can also benefit from explanations given to it. Currently, most attention is paid to explaining deep neural networks and black-box models. However, many of these approaches are not applicable to humanoid robots. Therefore, in this position paper, current problems with adapting XAI methods to explainable neurorobotics are described. Furthermore, NICO, an open-source humanoid robot platform, is introduced, along with how the interaction of intrinsic explanations by the robot itself and extrinsic explanations provided by the environment enables efficient robotic behavior.

artificial intelligence, explanation, machine learning, (16 more...)

2010.04602

Country: North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Bioinspired Bipedal Locomotion Control for Humanoid Robotics Based on EACO

Yang, Jingan, Peng, Yang

To construct a robot that can walk as efficiently and steadily as humans or other legged animals, we develop an enhanced elitist-mutated ant colony optimization (EACO) algorithm with genetic and crossover operators for real-time applications to humanoid and other legged robots. This work focuses on promoting the global search capability and convergence rate of the EACO applied to humanoid robots in real time by estimating the expected convergence rate using a Markov chain. Furthermore, we put a special focus on the EACO algorithm on a wide range of problems, from ACO, real-coded GAs, GAs with neural networks (NNs), and particle swarm optimization (PSO) to complex robotics systems including gait synthesis, dynamic modeling of parameterizable trajectories, and gait optimization of humanoid robotics. The experimental results illustrate the capability of this method to discover the premature convergence probability, successfully tackle inherent stagnation, and promote the convergence rate of EACO-based humanoid robotics systems, and they demonstrate the applicability and effectiveness of our strategy for solving sophisticated optimization tasks. We found reliable and fast walking gaits with a velocity of up to 0.47 m/s using the EACO optimization strategy. These findings have significant implications for understanding and tackling the inherent stagnation and poor convergence rate of the EACO and provide new insight into the genetic architectures and control optimization of humanoid robotics.

artificial intelligence, evolutionary algorithm, machine learning, (18 more...)

2010.04463

Country:

Asia > China > Jiangsu Province > Changzhou (0.04)
Asia > China > Anhui Province > Hefei (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Xiao, Ruichao, Singh, Manish Kumar, Yu, Rose

Dynamic Relational Inference in Multi-Agent Trajectories

arXiv.org Machine Learning, Oct-8-2020

Inferring interactions from multi-agent trajectories has broad applications in physics, vision and robotics. Neural relational inference (NRI) is a deep generative model that can reason about relations in complex dynamics without supervision. In this paper, we take a careful look at this approach for relational inference in multi-agent trajectories. First, we discover that NRI can be fundamentally limited without sufficient long-term observations. Its ability to accurately infer interactions degrades drastically for short output sequences. Next, we consider a more general setting of relational inference when interactions are changing over time. We propose an extension of NRI, which we call the DYnamic multi-Agent Relational Inference (DYARI) model, that can reason about dynamic relations. We conduct exhaustive experiments to study the effect of model architecture, underlying dynamics and training scheme on the performance of dynamic relational inference using a simulated physics system. We also showcase the usage of our model on real-world multi-agent basketball trajectories.

artificial intelligence, machine learning, relation, (16 more...)

2007.13524

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

arXiv.org Machine Learning, Oct-7-2020

Learning Theory for Inferring Interaction Kernels in Second-Order Interacting Agent Systems

Miller, Jason, Tang, Sui, Zhong, Ming, Maggioni, Mauro

Modeling the complex interactions of systems of particles or agents is a fundamental scientific and mathematical problem that is studied in diverse fields, ranging from physics and biology, to economics and machine learning. In this work, we describe a very general second-order, heterogeneous, multivariable, interacting agent model, with an environment, that encompasses a wide variety of known systems. We describe an inference framework that uses nonparametric regression and approximation theory based techniques to efficiently derive estimators of the interaction kernels which drive these dynamical systems. We develop a complete learning theory which establishes strong consistency and optimal nonparametric min-max rates of convergence for the estimators, as well as provably accurate predicted trajectories. The estimators exploit the structure of the equations in order to overcome the curse of dimensionality and we describe a fundamental coercivity condition on the inverse problem which ensures that the kernels can be learned and relates to the minimal singular value of the learning matrix. The numerical algorithm presented to build the estimators is parallelizable, performs well on high-dimensional problems, and is demonstrated on complex dynamical systems.

artificial intelligence, machine learning, (17 more...)

2010.03729

Country:

North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.84)

arXiv.org Artificial Intelligence, Oct-7-2020

Near-Optimal Regret Bounds for Model-Free RL in Non-Stationary Episodic MDPs

Mao, Weichao, Zhang, Kaiqing, Zhu, Ruihao, Simchi-Levi, David, Başar, Tamer

Reinforcement learning (RL) studies the class of problems where an agent maximizes its cumulative reward through sequential interaction with an unknown but fixed environment, usually modeled by a Markov Decision Process (MDP). At each time step, the agent takes an action, receives a random reward drawn from a reward function, and then the environment transitions to a new state according to an unknown transition kernel. In classical RL problems, the transition kernel and the reward functions are assumed to be time-invariant. This stationary model, however, cannot capture the fact that in many real-world decision-making problems, the environment, including both the transition dynamics and the reward functions, is inherently evolving over time. Non-stationarity exists in a wide range of applications, including online advertisement auctions (Cai et al., 2017; Lu et al., 2019), dynamic pricing (Board, 2008; Chawla et al., 2016), traffic management (Chen et al., 2020), healthcare operations (Shortreed et al., 2011), and inventory control (Agrawal & Jia, 2019). Among the many intriguing applications, we specifically emphasize two research areas that can significantly benefit from progress on non-stationary RL, yet whose connections have been largely overlooked in the literature. The first one is sequential transfer in RL (Tirinzoni et al., 2020) or multitask RL (Brunskill & Li, 2013).
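
The interaction loop described above, with a transition kernel and reward function that drift across episodes, can be sketched roughly as follows. This is only an illustrative toy, not the paper's algorithm: the two-state, two-action environment, the linear drift, the fixed policy, and every name (`kernel_at`, `reward_mean_at`, `run_episode`, `simulate`, `HORIZON`) are assumptions made for this example.

```python
import random

HORIZON = 5  # steps per episode (illustrative value)

def kernel_at(episode, num_episodes):
    """Hypothetical drifting kernel: P[s][a] is the probability of moving to state 0."""
    drift = 0.8 * episode / max(num_episodes - 1, 1)
    p = 0.9 - drift  # slides from 0.9 down to about 0.1 over the run
    return [[p, 1.0 - p], [1.0 - p, p]]

def reward_mean_at(episode, num_episodes):
    """Hypothetical drifting mean rewards R[s][a], kept inside [0, 1]."""
    shift = 0.5 * episode / max(num_episodes - 1, 1)
    return [[0.2 + shift, 0.7 - shift], [0.6 - shift, 0.1 + shift]]

def run_episode(kernel, reward_mean, policy, rng):
    """One episode: act, draw a random reward, then transition to the next state."""
    state, total = 0, 0.0
    for _ in range(HORIZON):
        action = policy(state)
        # Bernoulli reward whose mean depends on the current (state, action) pair.
        reward = 1.0 if rng.random() < reward_mean[state][action] else 0.0
        total += reward
        # Next state is sampled from the kernel, which the agent never observes.
        state = 0 if rng.random() < kernel[state][action] else 1
    return total

def simulate(num_episodes=10, seed=0):
    """Run a fixed policy while the environment changes from episode to episode."""
    rng = random.Random(seed)
    policy = lambda state: 0  # placeholder policy that always picks action 0
    returns = []
    for k in range(num_episodes):
        kernel = kernel_at(k, num_episodes)
        reward_mean = reward_mean_at(k, num_episodes)
        returns.append(run_episode(kernel, reward_mean, policy, rng))
    return returns

if __name__ == "__main__":
    print(simulate())
```

In the stationary case, `kernel_at` and `reward_mean_at` would return the same values for every episode; here they change with the episode index, which is the kind of evolving environment the passage refers to.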

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2010.03161

Country:

North America > United States > Illinois (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Workflow (0.92)
Research Report (0.64)

Industry:

Health & Medicine (0.49)
Transportation (0.34)
Marketing (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

AI Magazine, Oct-6-2020, 16:42:01 GMT

Intelligent Agents for Interactive Simulation Environments

Interactive simulation environments constitute one of today's promising emerging technologies, with applications in areas such as education, manufacturing, entertainment, and training. These environments are also rich domains for building and investigating intelligent automated agents, with requirements for the integration of a variety of agent capabilities but without the costs and demands of low-level perceptual processing or robotic control. Our project is aimed at developing humanlike, intelligent agents that can interact with each other, as well as with humans, in such virtual environments. Our current target is intelligent automated pilots for battlefield-simulation environments. These dynamic, interactive, multiagent environments pose interesting challenges for research on specialized agent capabilities as well as on the integration of these capabilities in the development of "complete" pilot agents.

artificial intelligence, intelligent agent, interactive simulation environment, (3 more...)

Industry: Leisure & Entertainment > Games > Computer Games (0.90)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)