AITopics

doi: 10.1145/3411764.3445260

2102.00625

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
North America > United States > Virginia (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.93)

Industry:

Law (1.00)
Health & Medicine (0.68)
Information Technology (0.67)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

#artificialintelligenceJan-30-2021, 22:27:10 GMT

US should not ban AI weapons - panel says autonomous systems can save lives

The National Security Commission on Artificial Intelligence (NSCAI) has advised the US government not to ban the development of AI-powered autonomous weapons, saying that they can - counterintuitively...

autonomous system, ban ai weapon, save lives

#artificialintelligence

Country: North America > United States (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.58)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.40)

Seddighin, Masoud ( Institute for Research in Fundamental Sciences (IPM) -- School of CS) | Latifian, Mohammad (Sharif University of Technology) | Ghodsi, Mohammad (Sharif University of Technology, Institute for Research in Fundamental Sciences (IPM) -- School of CS)

On the Distortion Value of Elections with Abstention

Journal of Artificial Intelligence ResearchJan-29-2021

In Spatial Voting Theory, distortion is a measure of how good the winner is. It has been proved that no deterministic voting mechanism can guarantee a distortion better than 3, even for simple metrics such as a line. In this study, we wish to answer the following question: how does the distortion value change if we allow less motivated agents to abstain from the election? We consider an election with two candidates and suggest an abstention model, which is a general form of the abstention model proposed by Kirchgässner. Our results characterize the distortion ¨ value and provide a rather complete picture of the model.

displacement, distortion, voter, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12306

AI Access Foundation

12306

Journal of Artificial Intelligence Research

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Government > Voting & Elections (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

arXiv.org Artificial IntelligenceJan-29-2021

Counterfactual Planning in AGI Systems

Holtman, Koen

We present counterfactual planning as a design approach for creating a range of safety mechanisms that can be applied in hypothetical future AI systems which have Artificial General Intelligence. The key step in counterfactual planning is to use an AGI machine learning system to construct a counterfactual world model, designed to be different from the real world the system is in. A counterfactual planning agent determines the action that best maximizes expected utility in this counterfactual planning world, and then performs the same action in the real world. We use counterfactual planning to construct an AGI agent emergency stop button, and a safety interlock that will automatically stop the agent before it undergoes an intelligence explosion. We also construct an agent with an input terminal that can be used by humans to iteratively improve the agent's reward function, where the incentive for the agent to manipulate this improvement process is suppressed. As an example of counterfactual planning in a non-agent AGI system, we construct a counterfactual oracle. As a design approach, counterfactual planning is built around the use of a graphical notation for defining mathematical counterfactuals. This two-diagram notation also provides a compact and readable language for reasoning about the complex types of self-referencing and indirect representation which are typically present inside machine learning agents.

agent, diagram, planning world, (15 more...)

2102.00834

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

arXiv.org Artificial IntelligenceJan-29-2021

Sequential Mechanisms for Multi-type Resource Allocation

Sikdar, Sujoy, Guo, Xiaoxi, Wang, Haibin, Xia, Lirong, Cao, Yongzhi

Several resource allocation problems involve multiple types of resources, with a different agency being responsible for "locally" allocating the resources of each type, while a central planner wishes to provide a guarantee on the properties of the final allocation given agents' preferences. We study the relationship between properties of the local mechanisms, each responsible for assigning all of the resources of a designated type, and the properties of a sequential mechanism which is composed of these local mechanisms, one for each type, applied sequentially, under lexicographic preferences, a well studied model of preferences over multiple types of resources in artificial intelligence and economics. We show that when preferences are O-legal, meaning that agents share a common importance order on the types, sequential mechanisms satisfy the desirable properties of anonymity, neutrality, non-bossiness, or Pareto-optimality if and only if every local mechanism also satisfies the same property, and they are applied sequentially according to the order O. Our main results are that under O-legal lexicographic preferences, every mechanism satisfying strategyproofness and a combination of these properties must be a sequential composition of local mechanisms that are also strategyproof, and satisfy the same combinations of properties.

agent, contradiction, mechanism, (15 more...)

2101.12522

Country:

North America > United States > New York > Broome County > Binghamton (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(3 more...)

Genre: Research Report (0.63)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Lindner, David, Matoba, Kyle, Meulemans, Alexander

Challenges for Using Impact Regularizers to Avoid Negative Side Effects

arXiv.org Artificial IntelligenceJan-29-2021

Designing reward functions for reinforcement learning is difficult: besides specifying which behavior is rewarded for a task, the reward also has to discourage undesired outcomes. Misspecified reward functions can lead to unintended negative side effects, and overall unsafe behavior. To overcome this problem, recent work proposed to augment the specified reward function with an impact regularizer that discourages behavior that has a big impact on the environment. Although initial results with impact regularizers seem promising in mitigating some types of side effects, important challenges remain. In this paper, we examine the main current challenges of impact regularizers and relate them to fundamental design decisions. We discuss in detail which challenges recent approaches address and which remain unsolved. Finally, we explore promising directions to overcome the unsolved challenges in preventing negative side effects with impact regularizers.

agent, baseline, side effect, (15 more...)

2101.12509

Country:

North America > United States (0.14)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Sokar, Ghada, Mocanu, Decebal Constantin, Pechenizkiy, Mykola

Self-Attention Meta-Learner for Continual Learning

Continual learning aims to provide intelligent agents capable of learning multiple tasks sequentially with neural networks. One of its main challenging, catastrophic forgetting, is caused by the neural networks non-optimal ability to learn in non-stationary distributions. In most settings of the current approaches, the agent starts from randomly initialized parameters and is optimized to master the current task regardless of the usefulness of the learned representation for future tasks. Moreover, each of the future tasks uses all the previously learned knowledge although parts of this knowledge might not be helpful for its learning. These cause interference among tasks, especially when the data of previous tasks is not accessible. In this paper, we propose a new method, named Self-Attention Meta-Learner (SAM), which learns a prior knowledge for continual learning that permits learning a sequence of tasks, while avoiding catastrophic forgetting. SAM incorporates an attention mechanism that learns to select the particular relevant representation for each future task. Each task builds a specific representation branch on top of the selected knowledge, avoiding the interference between tasks. We evaluate the proposed method on the Split CIFAR-10/100 and Split MNIST benchmarks in the task agnostic inference. We empirically show that we can achieve a better performance than several state-of-the-art methods for continual learning by building on the top of selected representation learned by SAM. We also show the role of the meta-attention mechanism in boosting informative features corresponding to the input data and identifying the correct target in the task agnostic inference. Finally, we demonstrate that popular existing continual learning methods gain a performance boost when they adopt SAM as a starting point.

knowledge, learning, representation, (13 more...)

2101.12136

Country: Europe > Netherlands > North Brabant > Eindhoven (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Derman, Esther, Dalal, Gal, Mannor, Shie

Acting in Delayed Environments with Non-Stationary Markov Policies

The standard Markov Decision Process (MDP) formulation hinges on the assumption that an action is executed immediately after it was chosen. However, assuming it is often unrealistic and can lead to catastrophic failures in applications such as robotic manipulation, cloud computing, and finance. We introduce a framework for learning and planning in MDPs where the decision-maker commits actions that are executed with a delay of $m$ steps. The brute-force state augmentation baseline where the state is concatenated to the last $m$ committed actions suffers from an exponential complexity in $m$, as we show for policy iteration. We then prove that with execution delay, Markov policies in the original state-space are sufficient for attaining maximal reward, but need to be non-stationary. As for stationary Markov policies, we show they are sub-optimal in general. Consequently, we devise a non-stationary Q-learning style model-based algorithm that solves delayed execution tasks without resorting to state-augmentation. Experiments on tabular, physical, and Atari domains reveal that it converges quickly to high performance even for substantial delays, while standard approaches that either ignore the delay or rely on state-augmentation struggle or fail due to divergence. The code is available at https://github.com/galdl/rl_delay_basic.git.

conference paper, iclr 2021, ps 0, (16 more...)

2101.11992

Country: North America > United States > Massachusetts > Middlesex County > Belmont (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

O'Callaghan, David, Mannion, Patrick

Exploring the Impact of Tunable Agents in Sequential Social Dilemmas

When developing reinforcement learning agents, the standard approach is to train an agent to converge to a fixed policy that is as close to optimal as possible for a single fixed reward function. If different agent behaviour is required in the future, an agent trained in this way must normally be either fully or partially retrained, wasting valuable time and resources. In this study, we leverage multi-objective reinforcement learning to create tunable agents, i.e. agents that can adopt a range of different behaviours according to the designer's preferences, without the need for retraining. We apply this technique to sequential social dilemmas, settings where there is inherent tension between individual and collective rationality. Learning a single fixed policy in such settings leaves one at a significant disadvantage if the opponents' strategies change after learning is complete. In our work, we demonstrate empirically that the tunable agents framework allows easy adaption between cooperative and competitive behaviours in sequential social dilemmas without the need for retraining, allowing a single trained agent model to be adjusted to cater for a wide range of behaviours and opponent strategies.

agent, predator, tunable agent, (14 more...)

2101.11967

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > Ireland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.49)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Hadoux, Emmanuel, Hunter, Anthony, Polberg, Sylwia

Strategic Argumentation Dialogues for Persuasion: Framework and Experiments Based on Modelling the Beliefs and Concerns of the Persuadee

Persuasion is an important and yet complex aspect of human intelligence. When undertaken through dialogue, the deployment of good arguments, and therefore counterarguments, clearly has a significant effect on the ability to be successful in persuasion. Two key dimensions for determining whether an argument is good in a particular dialogue are the degree to which the intended audience believes the argument and counterarguments, and the impact that the argument has on the concerns of the intended audience. In this paper, we present a framework for modelling persuadees in terms of their beliefs and concerns, and for harnessing these models in optimizing the choice of move in persuasion dialogues. Our approach is based on the Monte Carlo Tree Search which allows optimization in real-time. We provide empirical results of a study with human participants showing that our automated persuasion system based on this technology is superior to a baseline system that does not take the beliefs and concerns into account in its strategy.

argument, dialogue, participant, (15 more...)

2101.1187

Country:

North America > United States > Virginia (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Middle East > Malta > Port Region > Southern Harbour District > Floriana (0.04)
Africa > Mali (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Consumer Health (0.92)
Education (0.68)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)