 Agents


Learning Ontologies with Epistemic Reasoning: The EL Case

arXiv.org Artificial Intelligence

We investigate the problem of learning description logic ontologies from entailments via queries, using epistemic reasoning. We introduce a new learning model consisting of epistemic membership and example queries and show that polynomial learnability in this model coincides with polynomial learnability in Angluin's exact learning model with membership and equivalence queries. We then instantiate our learning framework to EL and show some complexity results for an epistemic extension of EL where epistemic operators can be applied over the axioms. Finally, we transfer known results for EL ontologies and its fragments to our learning model based on epistemic reasoning.


Ask Not What AI Can Do, But What AI Should Do: Towards a Framework of Task Delegability

arXiv.org Artificial Intelligence

Although artificial intelligence holds promise for addressing societal challenges, the questions of exactly which tasks to automate, and to what extent, remain understudied. We approach the problem of task delegability from a human-centered perspective by developing a framework on human perception of task delegation to artificial intelligence. We consider four high-level factors that can contribute to a delegation decision: motivation, difficulty, risk, and trust. To obtain an empirical understanding of human preferences in different tasks, we build a dataset of 100 tasks from academic papers, popular media portrayal of AI, and everyday life. For each task, we administer a survey to collect judgments of each factor and ask subjects to pick the extent to which they prefer AI involvement. We find little preference for full AI control and a strong preference for machine-in-the-loop designs, in which humans play the leading role. Our framework can effectively predict human preferences in degrees of AI assistance. Among the four factors, trust is the most predictive of human preferences of optimal human-machine delegation. This framework represents a first step towards characterizing human preferences of automation across tasks. We hope this work may encourage and aid in future efforts towards understanding such individual attitudes; our goal is to inform the public and the AI research community rather than to dictate any direction in technology development.


The Actor-Advisor: Policy Gradient With Off-Policy Advice

arXiv.org Artificial Intelligence

Actor-critic algorithms learn an explicit policy (actor), and an accompanying value function (critic). The actor performs actions in the environment, while the critic evaluates the actor's current policy. However, despite their stability and promising convergence properties, current actor-critic algorithms do not outperform critic-only ones in practice. We believe that the fact that the critic learns Q^pi, instead of the optimal Q-function Q*, prevents state-of-the-art robust and sample-efficient off-policy learning algorithms from being used. In this paper, we propose an elegant solution, the Actor-Advisor architecture, in which a Policy Gradient actor learns from unbiased Monte-Carlo returns, while being shaped (or advised) by the Softmax policy arising from an off-policy critic. The critic can be learned independently from the actor, using any state-of-the-art algorithm. Being advised by a high-quality critic, the actor quickly and robustly learns the task, while its use of the Monte-Carlo return helps overcome any bias the critic may have. In addition to a new Actor-Critic formulation, the Actor-Advisor, a method that allows an external advisory policy to shape a Policy Gradient actor, can be applied to many other domains. By varying the source of advice, we demonstrate the wide applicability of the Actor-Advisor to three other important subfields of RL: safe RL with backup policies, efficient leverage of domain knowledge, and transfer learning in RL. Our experimental results demonstrate the benefits of the Actor-Advisor compared to state-of-the-art actor-critic methods, illustrate its applicability to the three other application scenarios listed above, and show that many important challenges of RL can now be solved using a single elegant solution.
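The idea of an actor being advised by a softmax policy derived from a critic can be illustrated with a small sketch. The abstract does not spell out the mixing rule, so the element-wise product followed by renormalization used here is an assumption, as are the function names `softmax` and `shape_policy` and the example numbers.

```python
import math


def softmax(q_values, temperature=1.0):
    """Turn a list of Q-values into a probability distribution over actions."""
    highest = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - highest) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def shape_policy(actor_probs, advice_probs):
    """Combine actor and advice by element-wise product, then renormalize.

    This mixing rule is an assumption made for illustration only.
    """
    mixed = [a * b for a, b in zip(actor_probs, advice_probs)]
    total = sum(mixed)
    if total == 0.0:
        # Actor and advisor share no action with non-zero probability:
        # fall back to the actor's own distribution.
        return list(actor_probs)
    return [m / total for m in mixed]


# Hypothetical numbers: three actions, the critic prefers action 1.
actor_probs = [0.5, 0.3, 0.2]
critic_q = [1.0, 2.0, 0.5]
advice = softmax(critic_q)
behaviour = shape_policy(actor_probs, advice)
```

In this sketch the advice shifts probability mass towards the action the critic rates highest, while the actor's own distribution still influences the result.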


Agent-Based Adaptive Level Generation for Dynamic Difficulty Adjustment in Angry Birds

arXiv.org Artificial Intelligence

[...] is a key area of investigation for video game research (Hendrikx et al. 2013; Togelius et al. 2011). PLG can be extremely useful for increasing a game's length and replayability, as it allows a large number of levels to be created in a relatively short time. It is also possible to tailor the generated levels towards specific users' playstyles, known as adaptive level generation, which allows for a unique and personalised [...]

Section 2 describes the large amount of background and related work, both for Angry Birds and adaptive level generation in general. Section 3 presents our proposed adaptive generation method. Section 4 describes our conducted experiments and results. Section 5 discusses what these results could mean for both human players and agents. Section 6 concludes this work and outlines future possibilities.


Expressive mechanisms for equitable rent division on a budget

arXiv.org Artificial Intelligence

We achieve four objectives: (1) each agent is allowed to make a report that expresses her preference about violating her budget constraint, a feature not achieved by mechanisms that only elicit quasi-linear reports; (2) these reports are finite dimensional; (3) computation is feasible in polynomial time; and (4) incentive properties of envy-free mechanisms that elicit quasi-linear reports are preserved.


CESMA: Centralized Expert Supervises Multi-Agents

arXiv.org Artificial Intelligence

We consider the reinforcement learning problem of training multiple agents in order to maximize a shared reward. In this multi-agent system, each agent seeks to maximize the reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent observes a non-stationary and partially-observable environment. In order to resolve this issue, we demonstrate a novel multi-agent training framework that first turns a multi-agent problem into a single-agent problem to obtain a centralized expert that is then used to guide supervised learning for multiple independent agents with the goal of decentralizing the policy. We additionally demonstrate a way to turn the exponential growth in the joint action space into a linear growth for the centralized policy. Overall, the problem is twofold: the problem of obtaining a centralized expert, and then the problem of supervised learning to train the multi-agents. We demonstrate our solutions to both of these tasks, and show that supervised learning can be used to decentralize a multi-agent policy.
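The claim about exponential versus linear growth can be made concrete with a little arithmetic. The abstract does not describe the mechanism, so the sketch below assumes one plausible reading: a centralized policy with a single output per joint action is compared against one with a separate output head per agent. Both function names are hypothetical.

```python
def joint_action_count(n_agents, actions_per_agent):
    """Number of joint actions when every combination is a distinct output."""
    return actions_per_agent ** n_agents


def factored_output_count(n_agents, actions_per_agent):
    """Number of outputs when each agent gets its own head (assumed design)."""
    return n_agents * actions_per_agent


# Compare the two counts for a growing number of agents with 4 actions each.
for n in (2, 5, 10):
    print(n, joint_action_count(n, 4), factored_output_count(n, 4))
```

With 4 actions per agent, 10 agents give 4^10 = 1,048,576 joint actions but only 40 outputs under the factored layout.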


InfoBot: Transfer and Exploration via the Information Bottleneck

arXiv.org Machine Learning

A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out decision states. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned policy with an information bottleneck, we can identify decision states by examining where the model actually leverages the goal state. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
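One way to picture "examining where the model actually leverages the goal state" is to score each state by how far a goal-conditioned latent distribution departs from a goal-independent prior. The sketch below uses a KL divergence for that score; this choice, the state names, and all distributions are assumptions made for illustration and are not taken from the abstract.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical goal-independent prior over four latent codes.
prior = [0.25, 0.25, 0.25, 0.25]

# Hypothetical goal-conditioned encoder outputs in two states.
encoder_by_state = {
    "corridor": [0.25, 0.25, 0.25, 0.25],  # goal is ignored here
    "junction": [0.85, 0.05, 0.05, 0.05],  # goal strongly changes the code
}

# A larger divergence means the goal information is used more in that state.
scores = {state: kl_divergence(dist, prior) for state, dist in encoder_by_state.items()}
decision_state = max(scores, key=scores.get)
```

Under these made-up numbers the junction receives the higher score, which matches the intuition that decision states are the places where the goal matters for choosing what to do next.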


Distilling Policy Distillation

arXiv.org Machine Learning

The transfer of knowledge from one policy to another is an important tool in Deep Reinforcement Learning. This process, referred to as distillation, has been used to great success, for example, by enhancing the optimisation of agents, leading to stronger performance faster, on harder domains [26, 32, 5, 8]. Despite the widespread use and conceptual simplicity of distillation, many different formulations are used in practice, and the subtle variations between them can often drastically change the performance and the resulting objective that is being optimised. In this work, we rigorously explore the entire landscape of policy distillation, comparing the motivations and strengths of each variant through theoretical and empirical analysis. Our results point to three distillation techniques that are preferred depending on the specifics of the task. Specifically, a newly proposed expected entropy regularised distillation allows for quicker learning in a wide range of situations, while still guaranteeing convergence.
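As a rough illustration of what a distillation objective can look like, the sketch below computes the cross-entropy between a teacher's and a student's action distribution in a single state. This is a generic formulation assumed for illustration; it is not the expected entropy regularised variant named in the abstract, whose details are not given here. The function names and the example distributions are hypothetical.

```python
import math


def cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy H(teacher, student) for one state's action distributions."""
    return -sum(
        t * math.log(max(s, eps))  # clamp to avoid log(0)
        for t, s in zip(teacher_probs, student_probs)
    )


def entropy(probs):
    """Shannon entropy, the lowest cross-entropy any student can reach."""
    return cross_entropy(probs, probs)


# Hypothetical distributions over three actions.
teacher = [0.7, 0.2, 0.1]
close_student = [0.6, 0.3, 0.1]
far_student = [0.1, 0.2, 0.7]

close_loss = cross_entropy(teacher, close_student)
far_loss = cross_entropy(teacher, far_student)
```

A student that matches the teacher exactly attains the teacher's entropy, and the loss grows as the student's distribution moves away from the teacher's.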


Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications

arXiv.org Machine Learning

Reinforcement learning (RL) algorithms have been around for decades and have been employed to solve various sequential decision-making problems. These algorithms, however, have faced great challenges when dealing with high-dimensional environments. The recent development of deep learning has enabled RL methods to drive optimal policies for sophisticated and capable agents, which can perform efficiently in these challenging environments. This paper addresses an important aspect of deep RL related to situations that require multiple agents to communicate and cooperate to solve complex tasks. A survey of different approaches to problems related to multi-agent deep RL (MADRL) is presented, including non-stationarity, partial observability, continuous state and action spaces, multi-agent training schemes, and multi-agent transfer learning. The merits and demerits of the reviewed methods will be analyzed and discussed, with their corresponding applications explored. It is envisaged that this review provides insights about various MADRL methods and can lead to future development of more robust and highly useful multi-agent learning methods for solving real-world problems.


Situational Grounding within Multimodal Simulations

arXiv.org Artificial Intelligence

In this paper, we argue that simulation platforms enable a novel type of embodied spatial reasoning, one facilitated by a formal model of object and event semantics that renders the continuous quantitative search space of an open-world, real-time environment tractable. We provide examples for how a semantically-informed AI system can exploit the precise, numerical information provided by a game engine to perform qualitative reasoning about objects and events, facilitate learning novel concepts from data, and communicate with a human to improve its models and demonstrate its understanding. We argue that simulation environments, and game engines in particular, bring together many different notions of "simulation" and many different technologies to provide a highly-effective platform for developing both AI systems and tools to experiment in both machine and human intelligence.