AITopics

2108.08236

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Ground > Road (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Journal of Artificial Intelligence ResearchAug-18-2021

Agent-Based Markov Modeling for Improved COVID-19 Mitigation Policies

Capobianco, Roberto (Sony AI & Sapienza University of Rome) | Kompella, Varun (Sony AI) | Ault, James (Texas A&M University) | Sharon, Guni (Texas A&M University) | Jong, Stacy (The University of Texas at Austin) | Fox, Spencer (The University of Texas at Austin) | Meyers, Lauren (The University of Texas at Austin) | Wurman, Peter R. (Sony AI) | Stone, Peter (Sony AI & The University of Texas at Austin)

The year 2020 saw the covid-19 virus lead to one of the worst global pandemics in history. As a result, governments around the world have been faced with the challenge of protecting public health while keeping the economy running to the greatest extent possible. Epidemiological models provide insight into the spread of these types of diseases and predict the effects of possible intervention policies. However, to date, even the most data-driven intervention policies rely on heuristics. In this paper, we study how reinforcement learning (RL) and Bayesian inference can be used to optimize mitigation policies that minimize economic impact without overwhelming hospital capacity. Our main contributions are (1) a novel agent-based pandemic simulator which, unlike traditional models, is able to model fine-grained interactions among people at specific locations in a community; (2) an RLbased methodology for optimizing fine-grained mitigation policies within this simulator; and (3) a Hidden Markov Model for predicting infected individuals based on partial observations regarding test results, presence of symptoms, and past physical contacts. This article is part of the special track on AI and COVID-19.

agent-based markov modeling, infection probability, probability, (13 more...)

doi: 10.1613/jair.1.12632

AI Access Foundation

12632

Country:

Europe > Sweden (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Texas > Brazos County > College Station (0.14)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Fu, Justin, Tacchetti, Andrea, Perolat, Julien, Bachrach, Yoram

Evaluating Strategic Structures in Multi-Agent Inverse Reinforcement Learning

Journal of Artificial Intelligence ResearchAug-18-2021

A core question in multi-agent systems is understanding the motivations for an agent's actions based on their behavior. Inverse reinforcement learning provides a framework for extracting utility functions from observed agent behavior, casting the problem as finding domain parameters which induce such a behavior from rational decision makers. We show how to efficiently and scalably extend inverse reinforcement learning to multi-agent settings, by reducing the multi-agent problem to N single-agent problems while still satisfying rationality conditions such as strong rationality. However, we observe that rewards learned naively tend to lack insightful structure, which causes them to produce undesirable behavior when optimized in games with different players from those encountered during training. We further investigate conditions under which rewards or utility functions can be precisely identified, on problem domains such as normal-form and Markov games, as well as auctions, where we show we can learn reward functions that properly generalize to new settings.

agent, auction, equilibrium, (14 more...)

doi: 10.1613/jair.1.12594

AI Access Foundation

12594

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > United Kingdom (0.14)

Industry: Leisure & Entertainment > Games > Computer Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Sudhakara, Sagar, Mahajan, Aditya, Nayyar, Ashutosh, Ouyang, Yi

Scalable regret for learning to control network-coupled subsystems with unknown dynamics

arXiv.org Artificial IntelligenceAug-18-2021

We consider the problem of controlling an unknown linear quadratic Gaussian (LQG) system consisting of multiple subsystems connected over a network. Our goal is to minimize and quantify the regret (i.e. loss in performance) of our strategy with respect to an oracle who knows the system model. Viewing the interconnected subsystems globally and directly using existing LQG learning algorithms for the global system results in a regret that increases super-linearly with the number of subsystems. Instead, we propose a new Thompson sampling based learning algorithm which exploits the structure of the underlying network. We show that the expected regret of the proposed algorithm is bounded by $\tilde{\mathcal{O}} \big( n \sqrt{T} \big)$ where $n$ is the number of subsystems, $T$ is the time horizon and the $\tilde{\mathcal{O}}(\cdot)$ notation hides logarithmic terms in $n$ and $T$. Thus, the regret scales linearly with the number of subsystems. We present numerical experiments to illustrate the salient features of the proposed algorithm.

algorithm, denote, subsystem, (15 more...)

2108.0797

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Mateo County > Burlingame (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceAug-18-2021

A Framework for Understanding AI-Induced Field Change: How AI Technologies are Legitimized and Institutionalized

Larsen, Benjamin Cedric

Artificial intelligence (AI) systems operate in increasingly diverse areas, from healthcare to facial recognition, the stock market, autonomous vehicles, and so on. While the underlying digital infrastructure of AI systems is developing rapidly, each area of implementation is subject to different degrees and processes of legitimization. By combining elements from institutional theory and information systems-theory, this paper presents a conceptual framework to analyze and understand AI-induced field-change. The introduction of novel AI-agents into new or existing fields creates a dynamic in which algorithms (re)shape organizations and institutions while existing institutional infrastructures determine the scope and speed at which organizational change is allowed to occur. Where institutional infrastructure and governance arrangements, such as standards, rules, and regulations, still are unelaborate, the field can move fast but is also more likely to be contested. The institutional infrastructure surrounding AI-induced fields is generally little elaborated, which could be an obstacle to the broader institutionalization of AI-systems going forward.

digital infrastructure, infrastructure, institutional infrastructure, (16 more...)

doi: 10.1145/3461702.3462591

2108.07804

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > Texas (0.04)
(9 more...)

Genre: Research Report (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.56)
(3 more...)

Mota, Mateus P., Valcarce, Alvaro, Gorce, Jean-Marie, Hoydis, Jakob

The Emergence of Wireless MAC Protocols with Multi-Agent Reinforcement Learning

arXiv.org Artificial IntelligenceAug-17-2021

In this paper, we propose a new framework, exploiting the multi-agent deep deterministic policy gradient (MADDPG) algorithm, to enable a base station (BS) and user equipment (UE) to come up with a medium access control (MAC) protocol in a multiple access scenario. In this framework, the BS and UEs are reinforcement learning (RL) agents that need to learn to cooperate in order to deliver data. The network nodes can exchange control messages to collaborate and deliver data across the network, but without any prior agreement on the meaning of the control messages. In such a framework, the agents have to learn not only the channel access policy, but also the signaling policy. The collaboration between agents is shown to be important, by comparing the proposed algorithm to ablated versions where either the communication between agents or the central critic is removed. The comparison with a contention-free baseline shows that our framework achieves a superior performance in terms of goodput and can effectively be used to learn a new protocol.

agent, communication, protocol, (14 more...)

2108.07144

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Genre: Research Report (0.64)

Industry:

Telecommunications (0.66)
Commercial Services & Supplies > Security & Alarm Services (0.55)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Kwa, Hian Lee, Kit, Jabez Leong, Bouffanais, Roland

Tracking Multiple Fast Targets With Swarms: Interplay Between Social Interaction and Agent Memory

arXiv.org Artificial IntelligenceAug-16-2021

The task of searching for and tracking of multiple targets is a challenging one. However, most works in this area do not consider evasive targets that move faster than the agents comprising the multi-robot system. This is due to the assumption that the movement patterns of such targets, combined with their excessive speed, would make the task nearly impossible to accomplish. In this work, we show that this is not the case and we propose a decentralized search and tracking strategy in which the level of exploration and exploitation carried out by the swarm is adjustable. By tuning a swarm's exploration and exploitation dynamics, we demonstrate that there exists an optimal balance between the level of exploration and exploitation performed. This optimum maximizes its tracking performance and changes depending on the number of targets and the targets' movement profiles. We also show that the use of agent-based memory is critical in enabling the tracking of an evasive target. The obtained simulation results are validated through experimental tests with a decentralized swarm of six robots tracking a virtual fast-moving target.

agent, evasive target, swarm, (15 more...)

doi: 10.1162/isal_a_00376

2108.07122

Country:

Asia > Singapore (0.05)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > New York (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

arXiv.org Artificial IntelligenceAug-16-2021

Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction

Zheng, Fang, Wang, Le, Zhou, Sanping, Tang, Wei, Niu, Zhenxing, Zheng, Nanning, Hua, Gang

Understanding complex social interactions among agents is a key challenge for trajectory prediction. Most existing methods consider the interactions between pairwise traffic agents or in a local area, while the nature of interactions is unlimited, involving an uncertain number of agents and non-local areas simultaneously. Besides, they treat heterogeneous traffic agents the same, namely those among agents of different categories, while neglecting people's diverse reaction patterns toward traffic agents in ifferent categories. To address these problems, we propose a simple yet effective Unlimited Neighborhood Interaction Network (UNIN), which predicts trajectories of heterogeneous agents in multiple categories. Specifically, the proposed unlimited neighborhood interaction module generates the fused-features of all agents involved in an interaction simultaneously, which is adaptive to any number of agents and any range of interaction area. Meanwhile, a hierarchical graph attention module is proposed to obtain category-to-category interaction and agent-to-agent interaction. Finally, parameters of a Gaussian Mixture Model are estimated for generating the future trajectories. Extensive experimental results on benchmark datasets demonstrate a significant performance improvement of our method over the state-of-the-art methods.

agent, interaction, prediction, (14 more...)

2108.00238

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Guangxi Province > Nanning (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.69)

arXiv.org Artificial IntelligenceAug-14-2021

A Microscopic Pandemic Simulator for Pandemic Prediction Using Scalable Million-Agent Reinforcement Learning

Tang, Zhenggang, Yan, Kai, Sun, Liting, Zhan, Wei, Liu, Changliu

Microscopic epidemic models are powerful tools for government policy makers to predict and simulate epidemic outbreaks, which can capture the impact of individual behaviors on the macroscopic phenomenon. However, existing models only consider simple rule-based individual behaviors, limiting their applicability. This paper proposes a deep-reinforcement-learning-powered microscopic model named Microscopic Pandemic Simulator (MPS). By replacing rule-based agents with rational agents whose behaviors are driven to maximize rewards, the MPS provides a better approximation of real world dynamics. To efficiently simulate with massive amounts of agents in MPS, we propose Scalable Million-Agent DQN (SMADQN). The MPS allows us to efficiently evaluate the impact of different government strategies. This paper first calibrates the MPS against real-world data in Allegheny, US, then demonstratively evaluates two government strategies: information disclosure and quarantine. The results validate the effectiveness of the proposed method. As a broad impact, this paper provides novel insights for the application of DRL in large scale agent-based networks such as economic and social networks.

agent, infection, probability, (14 more...)

2108.06589

Country:

Oceania > Australia (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > France (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Gokcesu, Kaan, Gokcesu, Hakan

Optimal and Efficient Algorithms for General Mixable Losses against Switching Oracles

arXiv.org Machine LearningAug-13-2021

We investigate the problem of online learning, which has gained significant attention in recent years due to its applicability in a wide range of fields from machine learning to game theory. Specifically, we study the online optimization of mixable loss functions in a dynamic environment. We introduce online mixture schemes that asymptotically achieves the performance of the best dynamic estimation sequence of the switching oracle with optimal regret redundancies. The best dynamic estimation sequence that we compete against is selected in hindsight with full observation of the loss functions and is allowed to select different optimal estimations in different time intervals (segments). We propose two mixtures in our work. Firstly, we propose a tractable polynomial time complexity algorithm that can achieve the optimal redundancy of the intractable brute force approach. Secondly, we propose an efficient logarithmic time complexity algorithm that can achieve the optimal redundancy up to a constant multiplicity gap. Our results are guaranteed to hold in a strong deterministic sense in an individual sequence manner.

algorithm, base algorithm, redundancy, (15 more...)

arXiv.org Machine Learning

2108.06411

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Industry:

Education (0.48)
Leisure & Entertainment (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)