AITopics

2111.14746

Country:

North America > United States > Illinois (0.04)
Africa > Togo (0.04)

Genre: Research Report (0.40)

Industry:

Banking & Finance > Trading (0.68)
Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Silva, Cleyton R., Bowling, Michael, Lelis, Levi H.S.

Teaching People by Justifying Tree Search Decisions: An Empirical Study in Curling

Journal of Artificial Intelligence Research, Nov-29-2021

In this research note we show that a simple justification system can be used to teach humans non-trivial strategies of the Olympic sport of curling. This is achieved by justifying the decisions of Kernel Regression UCT (KR-UCT), a tree search algorithm that derives curling strategies by playing the game with itself. Given an action returned by KR-UCT and the expected outcome of that action, we use a decision tree to produce a counterfactual justification of KR-UCT's decision. The system samples other possible outcomes and selects for presentation the outcomes that are most similar to the expected outcome in terms of visual features and most different in terms of expected end-game value. A user study with 122 people shows that the participants who had access to the justifications produced by our system achieved much higher scores in a curling test than those who only observed the decision made by KR-UCT and those with access to the justifications of a baseline system. This is, to the best of our knowledge, the first work showing that a justification system is able to teach humans non-trivial strategies learned by an algorithm operating in self play.
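The selection step described in the abstract (outcomes close to the expected one in visual features, far from it in end-game value) can be illustrated with a minimal sketch. The `Outcome` class, the Euclidean feature distance, and the linear scoring rule with its `weight` parameter are assumptions made for illustration; the abstract does not specify the features or the scoring rule, and the decision-tree justification itself is not modelled here.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Outcome:
    features: tuple  # hypothetical visual features of the resulting state
    value: float     # estimated end-game value of that state


def select_counterfactuals(expected, samples, k=2, weight=1.0):
    """Return the k sampled outcomes that are close to the expected
    outcome in feature space but far from it in end-game value."""
    def score(outcome):
        feature_distance = dist(outcome.features, expected.features)
        value_gap = abs(outcome.value - expected.value)
        # Assumed trade-off: reward a large value gap, penalise feature distance.
        return weight * value_gap - feature_distance

    return sorted(samples, key=score, reverse=True)[:k]


expected = Outcome(features=(0.0, 0.0), value=1.0)
samples = [
    Outcome(features=(0.1, 0.0), value=-1.0),  # looks similar, very different value
    Outcome(features=(3.0, 3.0), value=-1.0),  # different value, but looks different too
    Outcome(features=(0.1, 0.1), value=0.9),   # looks similar, nearly the same value
]
best = select_counterfactuals(expected, samples, k=1)
```

With these made-up numbers, the first sample is selected: it is the only one that both resembles the expected outcome and leads to a clearly different result.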

counterfactual state, justification, participant, (17 more...)

doi: 10.1613/jair.1.13219

AI Access Foundation

13219

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
South America > Brazil (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Leisure & Entertainment > Sports > Olympic Games (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.93)

Cornelio, Cristina, Goldsmith, Judy, Grandi, Umberto, Mattei, Nicholas, Rossi, Francesca, Venable, K. Brent

Reasoning with PCP-Nets

Journal of Artificial Intelligence Research, Nov-29-2021

We introduce PCP-nets, a formalism to model qualitative conditional preferences with probabilistic uncertainty. PCP-nets generalise CP-nets by allowing for uncertainty over the preference orderings. We define and study both optimality and dominance queries in PCP-nets, and we propose a tractable approximation of dominance which we show to be very accurate in our experimental setting. Since PCP-nets can be seen as a way to model a collection of weighted CP-nets, we also explore the use of PCP-nets in a multi-agent context, where individual agents submit CP-nets which are then aggregated into a single PCP-net. We consider various ways to perform such aggregation and we compare them via two notions of scores, based on well-known voting theory concepts. Experimental results allow us to identify the aggregation method that best represents the given set of CP-nets and the most efficient dominance procedure to be used in the multi-agent context.
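As a rough illustration of an optimality query, the sketch below encodes a hypothetical two-variable PCP-net over binary features and computes the probability that a given outcome is optimal. It assumes that the preference rows are probabilistically independent and that the dependency graph is acyclic, so an outcome is optimal exactly when every variable takes its preferred value given its parents. The dictionary layout and the name `prob_optimal` are illustrative and not taken from the paper.

```python
# Hypothetical PCP-net: B's preference depends on A.
# Each row stores P(value 1 is preferred to value 0) for one parent assignment.
pcp_net = {
    "A": {"parents": (), "p_one_first": {(): 0.8}},
    "B": {"parents": ("A",), "p_one_first": {(1,): 0.9, (0,): 0.3}},
}


def prob_optimal(net, outcome):
    """Probability that `outcome` is the optimal outcome of the CP-net
    induced by sampling one ordering per row (rows assumed independent)."""
    probability = 1.0
    for variable, spec in net.items():
        context = tuple(outcome[parent] for parent in spec["parents"])
        p_one = spec["p_one_first"][context]
        probability *= p_one if outcome[variable] == 1 else 1.0 - p_one
    return probability


outcomes = [{"A": a, "B": b} for a in (0, 1) for b in (0, 1)]
distribution = {(o["A"], o["B"]): prob_optimal(pcp_net, o) for o in outcomes}
most_likely_optimal = max(distribution, key=distribution.get)
```

Under these assumptions the four probabilities sum to one, and the outcome most likely to be optimal here is A=1, B=1.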

cp-net, pcp-net, probability, (15 more...)

doi: 10.1613/jair.1.13009

AI Access Foundation

13009

Country:

North America > United States > Kentucky > Fayette County > Lexington (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
(2 more...)

Genre:

Instructional Material (0.67)
Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Journal of Artificial Intelligence Research, Nov-29-2021

Steady-State Planning in Expected Reward Multichain MDPs

Atia, George K. | Beckus, Andre (Air Force Research Laboratory) | Alkhouri, Ismail | Velasquez, Alvaro

The planning domain has experienced increased interest in the formal synthesis of decision-making policies. This formal synthesis typically entails finding a policy which satisfies formal specifications in the form of some well-defined logic. While many such logics have been proposed with varying degrees of expressiveness and complexity in their capacity to capture desirable agent behavior, their value is limited when deriving decision-making policies which satisfy certain types of asymptotic behavior in general system models. In particular, we are interested in specifying constraints on the steady-state behavior of an agent, which captures the proportion of time an agent spends in each state as it interacts for an indefinite period of time with its environment. This is sometimes called the average or expected behavior of the agent and the associated planning problem is faced with significant challenges unless strong restrictions are imposed on the underlying model in terms of the connectivity of its graph structure. In this paper, we explore this steady-state planning problem that consists of deriving a decision-making policy for an agent such that constraints on its steady-state behavior are satisfied. A linear programming solution for the general case of multichain Markov Decision Processes (MDPs) is proposed and we prove that optimal solutions to the proposed programs yield stationary policies with rigorous guarantees of behavior.
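To make the notion of steady-state behavior concrete, the sketch below takes a fixed stationary policy on a small made-up MDP, builds the Markov chain it induces, and estimates the long-run fraction of time spent in each state by repeated multiplication. This is not the linear program proposed in the paper: it only checks a given policy against one hypothetical constraint, and plain iteration of this kind is only reliable for a single aperiodic recurrent class, not for general multichain models. All numbers and names are assumptions.

```python
def induced_chain(transitions, policy):
    """Combine MDP transitions[s][a][t] with policy[s][a] into chain[s][t]."""
    n_states = len(transitions)
    chain = []
    for s in range(n_states):
        row = [0.0] * n_states
        for a, action_prob in enumerate(policy[s]):
            for t in range(n_states):
                row[t] += action_prob * transitions[s][a][t]
        chain.append(row)
    return chain


def steady_state(chain, start, iterations=1000):
    """Approximate the steady-state distribution by power iteration."""
    current = list(start)
    n_states = len(chain)
    for _ in range(iterations):
        current = [
            sum(current[s] * chain[s][t] for s in range(n_states))
            for t in range(n_states)
        ]
    return current


# Two states, two actions per state (illustrative probabilities).
transitions = [
    [[0.9, 0.1], [0.2, 0.8]],  # from state 0 under action 0 / action 1
    [[0.5, 0.5], [0.1, 0.9]],  # from state 1 under action 0 / action 1
]
policy = [[1.0, 0.0], [1.0, 0.0]]  # always take action 0

pi = steady_state(induced_chain(transitions, policy), start=[1.0, 0.0])
meets_spec = pi[0] >= 0.8  # hypothetical constraint: at least 80% of time in state 0
```

For this policy the chain spends about five sixths of its time in state 0, so the example constraint holds.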

constraint, markov chain, specification, (17 more...)

doi: 10.1613/jair.1.12611

AI Access Foundation

12611

Country:

North America > United States > New York (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Industry:

Government (0.46)
Law (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Sadek, Assem, Bono, Guillaume, Chidlovskii, Boris, Wolf, Christian

An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments

arXiv.org Artificial Intelligence, Nov-29-2021

Visual navigation by mobile robots is classically tackled through SLAM plus optimal planning, and more recently through end-to-end training of policies implemented as deep networks. The former approach is often limited to waypoint planning but has proven efficient even in real physical environments; the latter is most frequently employed in simulation but has been shown to learn more complex visual reasoning involving complex semantic regularities. Navigation by real robots in physical environments is still an open problem. End-to-end training approaches have been thoroughly tested in simulation only, with experiments involving real robots being restricted to rare performance evaluations in simplified laboratory conditions. In this work we present an in-depth study of the performance and reasoning capacities of real physical agents, trained in simulation and deployed to two different physical environments. Beyond benchmarking, we provide insights into the generalization capabilities of different agents trained under different conditions. We visualize sensor usage and the importance of the different types of signals. We show that, for the PointGoal task, an agent pre-trained on a wide variety of tasks and fine-tuned on a simulated version of the target environment can reach competitive performance without modelling any sim2real transfer, i.e. by deploying the trained agent directly from simulation to a real physical robot.

agent, robot, simulation, (14 more...)

2111.14666

Country: Europe > France (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.34)

arXiv.org Artificial Intelligence, Nov-28-2021

How Can Creativity Occur in Multi-Agent Systems?

Fujimoto, Ted

Complex systems show how surprising and beautiful phenomena can emerge from structures or agents following simple rules. With the recent success of deep reinforcement learning (RL), a natural path forward would be to use the capabilities of multiple deep RL agents to produce emergent behavior of greater benefit and sophistication. In general, this has proved to be an unreliable strategy without significant computation due to the difficulties inherent in multi-agent RL training. In this paper, we propose some criteria for creativity in multi-agent RL. We hope this proposal will give artists applying multi-agent RL a starting point, and provide a catalyst for further investigation guided by philosophical discussion.

agent, diversity, multi-agent rl, (15 more...)

2111.1431

Country:

North America > United States > Washington > King County > Seattle (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)

Genre: Research Report (0.51)

Industry: Materials (0.35)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.35)

Guresti, Bengisu, Ure, Nazim Kemal

Evaluating Generalization and Transfer Capacity of Multi-Agent Reinforcement Learning Across Variable Number of Agents

arXiv.org Artificial Intelligence, Nov-28-2021

Multi-agent Reinforcement Learning (MARL) problems often require cooperation among agents in order to solve a task. Centralization and decentralization are two approaches used for cooperation in MARL. While fully decentralized methods are prone to converging to suboptimal solutions due to partial observability and nonstationarity, methods involving centralization suffer from scalability limitations and the lazy agent problem. The centralized training with decentralized execution paradigm brings out the best of these two approaches; however, centralized training still has an upper limit of scalability, not only for acquired coordination performance but also for model size and training time. In this work, we adopt the centralized training with decentralized execution paradigm and investigate the generalization and transfer capacity of the trained models across a variable number of agents. This capacity is assessed by training a variable number of agents in a specific MARL problem and then performing greedy evaluations with a variable number of agents for each training configuration. Thus, we analyze the evaluation performance for each combination of training and evaluation agent counts. We perform experimental evaluations on predator-prey and traffic junction environments and demonstrate that it is possible to obtain similar or higher evaluation performance by training with fewer agents. We conclude that the optimal number of agents for training may differ from the target number of agents, and argue that transfer across a large number of agents can be a more efficient solution to scaling up than directly increasing the number of agents during training.

agent, agent count, evaluation, (15 more...)

2111.14177

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.05)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Stastny, Julian, Riché, Maxime, Lyzhov, Alexander, Treutlein, Johannes, Dafoe, Allan, Clifton, Jesse

Normative Disagreement as a Challenge for Cooperative AI

arXiv.org Artificial Intelligence, Nov-27-2021

Cooperation in settings where agents have both common and conflicting interests (mixed-motive environments) has recently received considerable attention in multi-agent learning. However, the mixed-motive environments typically studied have a single cooperative outcome on which all agents can agree. Many real-world multi-agent environments are instead bargaining problems (BPs): they have several Pareto-optimal payoff profiles over which agents have conflicting preferences. We argue that typical cooperation-inducing learning algorithms fail to cooperate in BPs when there is room for normative disagreement resulting in the existence of multiple competing cooperative equilibria, and illustrate this problem empirically. To remedy the issue, we introduce the notion of norm-adaptive policies. Norm-adaptive policies are capable of behaving according to different norms in different circumstances, creating opportunities for resolving normative disagreement. We develop a class of norm-adaptive policies and show in experiments that these significantly increase cooperation. However, norm-adaptiveness cannot address residual bargaining failure arising from a fundamental tradeoff between exploitability and cooperative robustness.
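A small numerical sketch can show why several Pareto-optimal payoff profiles leave room for normative disagreement. The payoff profiles below are made up, and the three welfare functions (utilitarian sum, Nash product with an assumed disagreement point of zero, egalitarian minimum) are standard textbook choices used for illustration, not necessarily the ones studied in the paper.

```python
from math import prod


def pareto_front(profiles):
    """Keep the payoff profiles that no other profile weakly improves on
    for every agent while strictly improving for at least one."""
    def dominated(p):
        return any(
            all(qi >= pi for qi, pi in zip(q, p))
            and any(qi > pi for qi, pi in zip(q, p))
            for q in profiles
        )

    return [p for p in profiles if not dominated(p)]


# Candidate norms, expressed as welfare functions over a payoff profile.
welfare_functions = {
    "utilitarian": sum,   # total payoff
    "nash": prod,         # product of payoffs (disagreement point assumed to be 0)
    "egalitarian": min,   # payoff of the worst-off agent
}

profiles = [(6, 1), (3, 3), (1, 5), (2, 2)]  # hypothetical two-agent payoffs
front = pareto_front(profiles)
choices = {name: max(front, key=fn) for name, fn in welfare_functions.items()}
```

Here three profiles are Pareto-optimal, and the utilitarian norm selects (6, 1) while the Nash and egalitarian norms select (3, 3): agents committed to different norms would aim for different cooperative outcomes.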

agent, normative disagreement, welfare function, (15 more...)

2111.13872

Country:

North America > Canada > Ontario > Toronto (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Panwar, Harsh, Chatterjee, Saswata, Dube, Wil

A Fast Evolutionary adaptation for MCTS in Pommerman

arXiv.org Artificial Intelligence, Nov-26-2021

Artificial intelligence, when combined with games, provides an ideal setting for research and for advancing the field. In multi-agent games, each agent has multiple controls, which generates large amounts of data and increases search complexity. Thus, we need advanced search methods to find a solution and create an artificially intelligent agent. In this paper, we propose a novel Fast Evolutionary Monte Carlo Tree Search (FEMCTS) agent, which borrows ideas from Evolutionary Algorithms (EA) and Monte Carlo Tree Search (MCTS) to play the game of Pommerman. It significantly outperforms the Rolling Horizon Evolutionary Algorithm (RHEA) in high-observability settings and performs almost as well as MCTS for most game seeds, outperforming it in some cases.

agent, mct, observability, (16 more...)

2111.1377

Country: Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

#artificialintelligence, Nov-25-2021, 13:56:27 GMT

How does AI play football? An analysis of RL and real-world football strategies

Agent-based simulation (ABS) is a computationally demanding technique for simulating dynamic complex systems and observing "emergent" behaviour. With ABS, we can explore different outcomes of phenomena for which it is infeasible to formulate and test hypotheses in real life. In the context of football, we can use ABS to examine the effects of different formations on match outcomes or to study various play styles using millions of simulated football games. The availability of good simulation environments is critical to ABS. Fortunately, football has received a lot of attention in this field thanks to the long history of the RoboCup simulation track [itsuki1995soccer].

football, rl and real-world football strategy, simulation environment, (3 more...)

#artificialintelligence

Industry:

Leisure & Entertainment > Sports > Soccer (0.97)
Leisure & Entertainment > Sports > Football (0.97)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.62)