AITopics

2201.11824

Country:

Europe > Germany > Bremen > Bremen (0.29)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > Queensland > Brisbane (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Tse, Hon Tik, Leung, Ho-fung

Exploiting Semantic Epsilon Greedy Exploration Strategy in Multi-Agent Reinforcement Learning

arXiv.org Artificial Intelligence, Jan-26-2022

Multi-agent reinforcement learning (MARL) can model many real world applications. However, many MARL approaches rely on epsilon greedy for exploration, which may discourage visiting advantageous states in hard scenarios. In this paper, we propose a new approach QMIX(SEG) for tackling MARL. It makes use of the value function factorization method QMIX to train per-agent policies and a novel Semantic Epsilon Greedy (SEG) exploration strategy. SEG is a simple extension to the conventional epsilon greedy exploration strategy, yet it is experimentally shown to greatly improve the performance of MARL. We first cluster actions into groups of actions with similar effects and then use the groups in a bi-level epsilon greedy exploration hierarchy for action selection. We argue that SEG facilitates semantic exploration by exploring in the space of groups of actions, which have richer semantic meanings than atomic actions. Experiments show that QMIX(SEG) largely outperforms QMIX and leads to strong performance competitive with current state-of-the-art MARL approaches on the StarCraft Multi-Agent Challenge (SMAC) benchmark.
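The bi-level selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `seg_select`, the two separate exploration rates `eps_group` and `eps_action`, the uniform sampling inside a group, and the hand-written `groups` mapping (standing in for the clustering step) are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence


def seg_select(q_values: Sequence[float],
               groups: Dict[int, List[int]],
               eps_group: float,
               eps_action: float,
               rng: random.Random) -> int:
    """Pick an action index with a two-level epsilon-greedy rule (illustrative)."""
    # Level 1 (assumed): explore in the space of action groups.
    # A group is drawn uniformly, then an action uniformly within it.
    if rng.random() < eps_group:
        group_id = rng.choice(sorted(groups))
        return rng.choice(groups[group_id])
    # Level 2: conventional epsilon-greedy over atomic actions.
    if rng.random() < eps_action:
        return rng.randrange(len(q_values))
    # Otherwise act greedily with respect to the per-agent Q-values.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Under these assumptions, group-level exploration gives every group the same chance of being tried regardless of how many atomic actions it contains, which is one way to read "exploring in the space of groups of actions".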

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2201.10803

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial Intelligence, Jan-26-2022

Constructing games on networks for controlling the inequalities in the capital distribution

Miszczak, Jarosław Adam

The inequality in capital or resource distribution is among the important phenomena observed in populations. The sources of inequality and methods for controlling it are of practical interest. To study this phenomenon, we introduce a model of interaction between agents in the network designed for reducing the inequality in the distribution of capital. To achieve the effect of inequality reduction, we interpret the outcome of the elementary game played in the network such that winning the game is translated into the reduction of the inequality. We study different interpretations of the introduced scheme and their impact on the behaviour of agents in terms of the capital distribution, and we provide examples based on the capital-dependent Parrondo's paradox. The results presented in this study provide insight into the mechanics of inequality formation in society.
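For readers unfamiliar with the capital-dependent Parrondo games mentioned above, the following Python sketch shows the commonly cited textbook form of the two elementary games. The win probabilities and the bias `eps = 0.005` are the standard illustrative values, not necessarily those used in this paper, and the network interpretation from the abstract is not modelled; the function names are hypothetical.

```python
import random


def win_probability(game: str, capital: int, eps: float = 0.005) -> float:
    """Win probability of one round in the textbook capital-dependent scheme."""
    if game == "A":
        # Game A: a slightly unfavourable coin, independent of capital.
        return 0.5 - eps
    if game == "B":
        # Game B: a bad coin when capital is a multiple of 3, a good coin otherwise.
        return (0.1 - eps) if capital % 3 == 0 else (0.75 - eps)
    raise ValueError(f"unknown game: {game!r}")


def play(capital: int, game: str, rng: random.Random, eps: float = 0.005) -> int:
    """Play one round: capital goes up by 1 on a win and down by 1 on a loss."""
    won = rng.random() < win_probability(game, capital, eps)
    return capital + 1 if won else capital - 1
```

Each game is losing on average when played alone, while suitable alternations of A and B are known to produce a positive drift, which is the paradox the abstract refers to.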

agent, capital, inequality, (17 more...)

doi: 10.1016/j.physa.2022.126997

2201.10913

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Poland (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Communications (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.30)

arXiv.org Artificial Intelligence, Jan-26-2022

Probe-Based Interventions for Modifying Agent Behavior

Tucker, Mycal, Kuhl, William, Shahid, Khizer, Karten, Seth, Sycara, Katia, Shah, Julie

Neural nets are powerful function approximators, but the behavior of a given neural net, once trained, cannot be easily modified. We wish, however, for people to be able to influence neural agents' actions despite the agents never training with humans, which we formalize as a human-assisted decision-making problem. Inspired by prior art initially developed for model explainability, we develop a method for updating representations in pre-trained neural nets according to externally-specified properties. In experiments, we show how our method may be used to improve human-agent team performance for a variety of neural networks from image classifiers to agents in multi-agent reinforcement learning settings.

intervention, probe, representation, (16 more...)

2201.12938

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Robohub, Jan-25-2022, 20:41:43 GMT

Learning for Collaboration, Not Competition

Jakob Foerster, an accredited Machine Learning Research Scientist who has been at the forefront of research on Multi-Agent Learning, speaks with interviewer Kegan Strawn. Dr. Foerster explains why incorporating uncertainty into multi-agent interactions is essential to creating robust algorithms that can operate not only in games but also in real-world applications. Jakob Foerster is an Associate Professor at the University of Oxford. His papers have won prestigious awards at top machine learning conferences (ICML, AAAI) and have helped push deep multi-agent reinforcement learning to the forefront of AI research. Jakob previously worked at Facebook AI Research and received his Ph.D. from the University of Oxford under the supervision of Shimon Whiteson.

collaboration, competition, learning, (4 more...)

Robohub

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.57)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.80)

Carminati, Luca, Cacciamani, Federico, Ciccone, Marco, Gatti, Nicola

Public Information Representation for Adversarial Team Games

The peculiarity of adversarial team games resides in the asymmetric information available to the team members during play, which makes the equilibrium computation problem hard even with zero-sum payoffs. The algorithms available in the literature work with implicit representations of the strategy space and mainly resort to Linear Programming and column generation techniques to incrementally enlarge the strategy space. Such representations prevent the adoption of standard tools such as abstraction generation, game solving, and subgame solving, which have proved crucial when solving huge, real-world two-player zero-sum games. Unlike these works, we answer the question of whether there is any suitable game representation enabling the adoption of those tools. In particular, our algorithms convert a sequential team game with adversaries to a classical two-player zero-sum game. In this converted game, the team is transformed into a single coordinator player who only knows information common to the whole team and prescribes to the players an action for any possible private state. Interestingly, we show that our game is more expressive than the original extensive-form game, as any state/action abstraction of the extensive-form game can be captured by our representation, while the reverse does not hold. Due to the NP-hard nature of the problem, the resulting Public Team game may be exponentially larger than the original one. To limit this explosion, we provide three algorithms, each returning an information-lossless abstraction that dramatically reduces the size of the tree. These abstractions can be produced without generating the original game tree. Finally, we show the effectiveness of the proposed approach by presenting experimental results on Kuhn and Leduc Poker games, obtained by applying state-of-the-art algorithms for two-player zero-sum games on the converted games.

information, node, private state, (16 more...)

2201.10377

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Werner, Mariel A., Angelopoulos, Anastasios, Bates, Stephen, Jordan, Michael I.

Online Active Learning with Dynamic Marginal Gain Thresholding

The blessing of ubiquitous data also comes with a curse: the communication, storage, and labeling of massive, mostly redundant datasets. In our work, we seek to solve the problem at its source, collecting only valuable data and throwing out the rest, via active learning. We propose an online algorithm which, given any stream of data, any assessment of its value, and any formulation of its selection cost, extracts the most valuable subset of the stream up to a constant factor while using minimal memory. Notably, our analysis also holds for the federated setting, in which multiple agents select online from individual data streams without coordination and with potentially very different appraisals of cost. One particularly important use case is selecting and labeling training sets from unlabeled collections of data that maximize the test-time performance of a given classifier. In prediction tasks on ImageNet and MNIST, we show that our selection method outperforms random selection by up to 5-20%.
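The general idea of selecting from a stream by thresholding marginal gain can be sketched in Python as follows. This is only an illustration of the technique named in the title, not the paper's algorithm: the function `threshold_select`, the toy `coverage` value function, and the size-dependent threshold schedule passed in by the caller are assumptions.

```python
from typing import Callable, FrozenSet, Iterable, List


def coverage(items: List[FrozenSet[int]]) -> float:
    """Toy value function: number of distinct elements covered by the selection."""
    return float(len(set().union(*items)))


def threshold_select(stream: Iterable[FrozenSet[int]],
                     value: Callable[[List[FrozenSet[int]]], float],
                     threshold: Callable[[int], float]) -> List[FrozenSet[int]]:
    """Keep a streamed item only if its marginal gain clears the current threshold."""
    selected: List[FrozenSet[int]] = []
    for item in stream:
        # Marginal gain of adding this item to what has been kept so far.
        gain = value(selected + [item]) - value(selected)
        # The threshold may depend on how much has already been selected
        # (a hypothetical stand-in for a selection-cost formulation).
        if gain >= threshold(len(selected)):
            selected.append(item)
    return selected
```

Each item is seen once and either kept or discarded immediately, so memory use is bounded by the size of the selected subset rather than the length of the stream.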

dmgt, subset, threshold, (16 more...)

2201.10547

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

McShane, Marjorie, Leon, Ivan

Language Generation for Broad-Coverage, Explainable Cognitive Systems

This paper describes recent progress on natural language generation (NLG) for language-endowed intelligent agents (LEIAs) developed within the OntoAgent cognitive architecture. The approach draws heavily from past work on natural language understanding in this paradigm: it uses the same knowledge bases, theory of computational linguistics, agent architecture, and methodology of developing broad-coverage capabilities over time while still supporting near-term applications.

content specification, nlg tmr, tmr, (16 more...)

2201.10422

Country:

North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report (1.00)
Overview (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.89)

Aerospace Human System Integration Evolution over the Last 40 Years

Boy, Guy Andre

aerospace human system integration evolution, engineering application and case study, human system engineering application, (9 more...)

This chapter focuses on the evolution of Human-Centered Design (HCD) in aerospace systems over the last forty years. Human Factors and Ergonomics first shifted from the study of physical and medical issues to cognitive issues circa the 1980s. The advent of computers brought with it the development of human-computer interaction (HCI), which then expanded into the field of digital interaction design and User Experience (UX). We ended up with the concept of interactive cockpits, not because pilots interacted with mechanical things, but because they interacted using pointing devices on computer displays. Since the early 2000s, complexity and organizational issues have gained prominence to the point that complex systems design and management found itself center stage, with the spotlight on the role of the human element and organizational setups. Today, Human Systems Integration (HSI) is no longer only a single-agent problem, but a multi-agent research field. Systems are systems of systems, considered as representations of people and machines. They are made of statically and dynamically articulated structures and functions. When they are at work, they are living organisms that generate emerging functions and structures that need to be considered in evolution (i.e., in their constant redesign). More specifically, this chapter will focus on human factors such as human-centered systemic representations, life-critical systems, organizational issues, complexity management, modeling and simulation, flexibility, tangibility, and autonomy. The discussion will be based on several examples in civil aviation and air combat, as well as aerospace.

2201.10275

Country:

North America > United States > New Jersey > Bergen County > Mahwah (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > New York > Albany County > Albany (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Air (1.00)
Government > Military (1.00)
Aerospace & Defense > Aircraft (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.76)

arXiv.org Artificial Intelligence, Jan-24-2022

Dynamics-Aware Comparison of Learned Reward Functions

Wulfe, Blake, Balakrishna, Ashwin, Ellis, Logan, Mercat, Jean, McAllister, Rowan, Gaidon, Adrien

The ability to learn reward functions plays an important role in enabling the deployment of intelligent agents in the real world. However, comparing reward functions, for example as a means of evaluating reward learning methods, presents a challenge. Reward functions are typically compared by considering the behavior of optimized policies, but this approach conflates deficiencies in the reward function with those of the policy search algorithm used to optimize it. To address this challenge, Gleave et al. (2020) propose the Equivalent-Policy Invariant Comparison (EPIC) distance. EPIC avoids policy optimization, but in doing so requires computing reward values at transitions that may be impossible under the system dynamics. This is problematic for learned reward functions because it entails evaluating them outside of their training distribution, resulting in inaccurate reward values that we show can render EPIC ineffective at comparing rewards. To address this problem, we propose the Dynamics-Aware Reward Distance (DARD), a new reward pseudometric. DARD uses an approximate transition model of the environment to transform reward functions into a form that allows for comparisons that are invariant to reward shaping while only evaluating reward functions on transitions close to their training distribution. Experiments in simulated physical domains demonstrate that DARD enables reliable reward comparisons without policy optimization and is significantly more predictive than baseline methods of downstream policy performance when dealing with learned reward functions.

reward function, reward model, transition model, (14 more...)

2201.10081

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Industry: Transportation > Air (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)