Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs

AAAI Conferences

The Markov Decision Process (MDP) framework is a versatile method for addressing single- and multiagent sequential decision-making problems. Many exact and approximate solution methods attempt to exploit structure in the problem and are based on value factorization. Multiagent settings (MASs) in particular, however, are known to suffer from an exponential increase in value component sizes as interactions become denser, meaning that approximation architectures are overly restricted in the problem sizes and types they can handle. We present an approach that mitigates this limitation for certain types of MASs, exploiting a property that can be thought of as "anonymous influence" in the factored MDP. In particular, we show how anonymity can lead to representational and computational efficiencies, both for general variable elimination in a factor graph and for the approximate linear programming solution to factored MDPs. The latter allows us to scale linear programming to factored MDPs that were previously unsolvable. Our results are shown for a disease control domain over a graph with 50 nodes, each connected to up to 15 neighbors.
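
To make the "anonymous influence" idea concrete, here is a minimal sketch (an illustration for this summary, not the authors' code) of a factor whose value depends only on how many neighbours are infected: it can be stored with a count-indexed table of k + 1 entries instead of a joint table of 2^k entries, which is the kind of saving that both variable elimination and the approximate linear program can exploit. The function names and the concrete factor are made up.

```python
from itertools import product

# Illustrative sketch (not the authors' code): a factor over k binary
# neighbour variables whose value depends only on the COUNT of neighbours
# that are infected. Enumerating joint assignments needs 2**k table entries;
# exploiting this "anonymous influence" needs only k + 1 entries, one per count.

def infection_pressure(count, k):
    """Hypothetical anonymous factor: value depends only on the count."""
    return count / k  # e.g., the fraction of infected neighbours

def tabulate_joint(k):
    """Naive factor table over all 2**k joint neighbour assignments."""
    return {assign: infection_pressure(sum(assign), k)
            for assign in product((0, 1), repeat=k)}

def tabulate_count(k):
    """Count-based table exploiting anonymity: only k + 1 entries."""
    return {c: infection_pressure(c, k) for c in range(k + 1)}

if __name__ == "__main__":
    k = 15  # up to 15 neighbours, as in the disease-control domain
    print("joint table entries:", len(tabulate_joint(k)))   # 32768
    print("count table entries:", len(tabulate_count(k)))   # 16
```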


Kognit: Intelligent Cognitive Enhancement Technology by Cognitive Models and Mixed Reality for Dementia Patients

AAAI Conferences

With advancements in technology, smartphones can already serve as memory aids. Electronic calendars are of great use in time-based memory tasks. In this project, we enter the mixed reality realm to help dementia patients. Dementia is a general term for a decline in mental ability severe enough to interfere with daily life; memory loss is an example. Here, mixed reality refers to the merging of real and virtual worlds to produce new episodic memory visualisations where physical and digital objects co-exist and interact in real time. Cognitive models are approximations of a patient's mental abilities and limitations involving conscious mental activities (such as thinking, understanding, learning, and remembering). External representations of episodic memory help patients and caregivers coordinate their actions with one another. We advocate distributed cognition, which involves the coordination between individuals, artefacts, and the environment, in four main implementations of artificial intelligence technology in the Kognit storyboard: (1) speech dialogue and episodic memory retrieval; (2) monitoring medication management and tracking an elder's behaviour (e.g., drinking water); (3) eye tracking and modelling cognitive abilities; and (4) serious game development towards active memory training. We discuss the storyboard, use cases and usage scenarios, and some implementation details of cognitive models and mixed reality hardware for the patient. The purpose of future studies is to determine the extent to which cognitive enhancement technology can be used to decrease caregiver burden.


Nested Value Iteration for Partially Satisfiable Co-Safe LTL Specifications (Extended Abstract)

AAAI Conferences

We describe our recent work on cost-optimal policy generation for co-safe linear temporal logic (LTL) specifications that are not satisfiable with probability one in a Markov decision process (MDP) model. We provide an overview of the approach, which poses the problem as the optimisation of three standard objectives in a trimmed product MDP. Furthermore, we introduce a new approach for optimising the three objectives, in decreasing order of priority, based on a "nested" value iteration, where one value table is kept for each objective.
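
A rough sketch of the nested idea, keeping one value table per objective and letting each lower-priority objective choose only among actions that remain optimal for the higher-priority ones, is given below. It is not the paper's algorithm (it assumes an ordinary discounted MDP supplied as plain Python functions and ignores the partial-satisfiability aspect); all names are illustrative.

```python
# Minimal sketch (not the paper's implementation) of a "nested" value
# iteration: one value table per objective, optimised in decreasing order of
# priority by restricting each lower-priority objective to actions that are
# still optimal for all higher-priority objectives. All names are illustrative.

def value_iteration(states, trans, reward, allowed, gamma=0.95, iters=200):
    """Standard VI where each state may only use its `allowed` actions."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        for s in states:
            V[s] = max(reward(s, a) + gamma * sum(p * V[s2] for s2, p in trans(s, a))
                       for a in allowed[s])
    return V

def greedy_actions(states, trans, reward, allowed, V, gamma=0.95, tol=1e-6):
    """Actions within `tol` of the best backed-up value in each state."""
    out = {}
    for s in states:
        q = {a: reward(s, a) + gamma * sum(p * V[s2] for s2, p in trans(s, a))
             for a in allowed[s]}
        best = max(q.values())
        out[s] = [a for a, v in q.items() if v >= best - tol]
    return out

def nested_vi(states, actions, trans, rewards, gamma=0.95):
    """Optimise a list of reward functions in decreasing order of priority."""
    allowed = {s: list(actions) for s in states}
    tables = []
    for reward in rewards:                      # highest-priority objective first
        V = value_iteration(states, trans, reward, allowed, gamma)
        tables.append(V)
        allowed = greedy_actions(states, trans, reward, allowed, V, gamma)
    return tables, allowed                      # final `allowed` is the nested policy

if __name__ == "__main__":
    states, actions = ["s0", "s1"], ["a", "b"]
    trans = lambda s, a: [("s1", 1.0)] if a == "a" else [("s0", 1.0)]
    r_primary = lambda s, a: -1.0                          # all actions tie
    r_secondary = lambda s, a: -2.0 if a == "a" else -1.0  # prefers "b"
    print(nested_vi(states, actions, trans, [r_primary, r_secondary])[1])
```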


MDPVIS: An Interactive Visualization for Testing Markov Decision Processes

AAAI Conferences

Whereas computational steering traditionally refers to modifying a computer process during its execution (Mulder, van Wijk, and van Liere 1999), we treat optimization as an open-ended process whose parameters are repeatedly changed for testing and debugging. A common approach for solving Markov Decision Processes is to implement a simulator of the stochastic dynamics of the MDP and a Monte Carlo optimization algorithm that invokes this simulator. The resulting software system is often …
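
The simulator-plus-Monte-Carlo-optimization pattern this entry refers to can be sketched as follows. This is an illustration of the general pattern only, not MDPVIS or the authors' code; the toy simulator, parameters, and reward are made up.

```python
import random

# Illustrative sketch of the simulator + Monte Carlo optimisation pattern
# (this is not MDPVIS itself): roll out a parameterised policy in a toy
# stochastic simulator many times and keep the parameter with the best
# average return. In MDPVIS-style workflows, parameters like these are what
# a user repeatedly changes while testing and debugging.

def simulate(step_size, horizon=10, seed=None):
    """Toy stochastic simulator: walk toward a target with a noisy step size."""
    rng = random.Random(seed)
    position, total_reward = 0.0, 0.0
    for _ in range(horizon):
        position += step_size + rng.gauss(0, 0.05)
        total_reward += -abs(position - 1.0)   # reward: stay close to target 1.0
    return total_reward

def monte_carlo_search(candidates, rollouts=50):
    """Pick the candidate parameter with the highest mean simulated return."""
    def mean_return(p):
        return sum(simulate(p, seed=i) for i in range(rollouts)) / rollouts
    return max(candidates, key=mean_return)

if __name__ == "__main__":
    print("best step size:", monte_carlo_search([0.0, 0.05, 0.1, 0.2, 0.5]))
```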


Modeling Motivational States for Adaptive Robot Companions

AAAI Conferences

Motivation impacts people's lives in a powerful way and is at the heart of a plethora of day-to-day activities and achievement settings, from success at the workplace to learning and acquiring knowledge to trying to quit bad habits. The current work aims to develop an adaptive robot companion that models a user's daily motivational state and chooses appropriate motivational strategies to keep the user on track for achieving a daily goal. The two main components we are focusing on in this context are creating an ontology-based user model of the person's motivational states and using an appropriate strategy selection algorithm that chooses the best motivational strategies for the user each day based on the user model's output. Specifically, we are focusing on the important application domain of physical activity and aim to help early adolescents achieve daily recommended levels of physical activity. Our human-robot interaction system uses information acquired from the user to feed the user model and physical activity data from a wristband device to inform the strategy selection algorithm.
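
A toy sketch of the strategy-selection step, choosing a daily motivational strategy from the user model's estimated motivational state and the wristband's step count, is shown below. It is purely illustrative: the strategy names, thresholds, and rule are invented and are not the authors' ontology-based model or selection algorithm.

```python
# Illustrative sketch only (not the authors' system): pick a motivational
# strategy for the day from a simple user-model output (estimated motivational
# state) and wristband step counts. Strategy names and rules are made up.

def select_strategy(motivation, steps_today, daily_goal=12000):
    """motivation in [0, 1] from the user model; steps from the wristband."""
    progress = steps_today / daily_goal
    if progress >= 1.0:
        return "praise_progress"
    if motivation < 0.3:
        return "set_small_goal"          # low motivation: lower the bar
    if progress < 0.5:
        return "suggest_activity"        # motivated but behind on steps
    return "remind_of_goal"

if __name__ == "__main__":
    print(select_strategy(motivation=0.2, steps_today=3000))   # set_small_goal
```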


Revisiting Multi-Objective MDPs with Relaxed Lexicographic Preferences

AAAI Conferences

We consider stochastic planning problems that involve multiple objectives such as minimizing task completion time and energy consumption. These problems can be modeled as multi-objective Markov decision processes (MOMDPs), an extension of the widely used MDP model to handle problems involving multiple value functions. We focus on a subclass of MOMDPs in which the objectives have a relaxed lexicographic structure, allowing an agent to seek improvement in a lower-priority objective when the impact on a higher-priority objective is within some small given tolerance. We examine the relationship between this class of problems and constrained MDPs, showing that the latter offer an alternative solution method with strong guarantees. We show empirically that a recently introduced algorithm for MOMDPs may not offer the same strong guarantees, but it does perform well in practice.
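
The relaxed lexicographic restriction can be pictured with a small sketch: rather than keeping only actions that are exactly optimal for a higher-priority objective, actions within a tolerance of the best are kept and handed to the next objective. This illustrates the definition only, not the paper's algorithm; the Q-values and tolerance below are made up.

```python
# Minimal sketch (not the paper's algorithm) of the "relaxed lexicographic"
# idea: when moving to the next objective, keep every action whose value for
# the higher-priority objective is within a tolerance `delta` of the best,
# instead of requiring exact optimality as a strict lexicographic order would.

def relaxed_restriction(q_values, delta):
    """q_values: action -> value under the higher-priority objective."""
    best = max(q_values.values())
    return [a for a, q in q_values.items() if q >= best - delta]

if __name__ == "__main__":
    q_time = {"fast": -3.0, "medium": -3.2, "slow": -6.0}   # task completion time
    # With delta = 0.5 the agent may also keep "medium", and can then pick
    # whichever surviving action is best for energy consumption.
    print(relaxed_restriction(q_time, delta=0.5))            # ['fast', 'medium']
```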


Uninformed-to-Informed Exploration in Unstructured Real-World Environments

AAAI Conferences

Conventionally, the process of learning the model (exploration) is initialized with either an uninformed or an informed policy, where the latter leverages observations to guide future exploration. Informed exploration is ideal, as it may allow a model to be learned from fewer samples. However, informed exploration cannot be implemented from the onset when a priori knowledge of the sensing domain statistics is not available; such policies would only sample the first set of locations, repeatedly. Hence, we present a theoretically derived bound for transitioning from uninformed to informed exploration in unstructured real-world environments, which may be partially observable and time-varying. This bound is used in tandem with a sparsified Bayesian nonparametric Poisson Exposure Process, which is used to learn to predict the value of information in partially observable and time-varying domains. The result is an uninformed-to-informed exploration policy that outperforms baseline algorithms on real-world datasets.
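
The overall uninformed-to-informed pattern can be sketched as below: sample uniformly at random until a threshold number of observations has been collected (a stand-in for the paper's theoretically derived bound), then pick the location with the highest estimated value. The sketch does not reproduce the actual bound or the sparsified Bayesian nonparametric Poisson Exposure Process; all names and numbers are illustrative.

```python
import random
from collections import defaultdict

# Illustrative sketch (not the paper's method): explore uniformly at random
# until `switch_bound` samples have been collected (a stand-in for the paper's
# theoretically derived bound), then switch to "informed" exploration that
# picks the location with the highest estimated mean observation.

def explore(locations, observe, switch_bound, steps, seed=0):
    rng = random.Random(seed)
    counts, sums, history = defaultdict(int), defaultdict(float), []
    for t in range(steps):
        if t < switch_bound:                     # uninformed phase
            loc = rng.choice(locations)
        else:                                    # informed phase
            loc = max(locations,
                      key=lambda l: sums[l] / counts[l] if counts[l] else float("inf"))
        value = observe(loc)
        counts[loc] += 1
        sums[loc] += value
        history.append((loc, value))
    return history

if __name__ == "__main__":
    rng = random.Random(1)
    true_rates = {"A": 0.2, "B": 0.8, "C": 0.5}
    observe = lambda loc: rng.random() < true_rates[loc]   # noisy 0/1 reading
    print(explore(list(true_rates), observe, switch_bound=15, steps=30))
```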


Self-Confidence of Autonomous Systems in a Military Environment

AAAI Conferences

The topic of the self-confidence of autonomous systems is discussed from the perspective of its use in a military environment. The concepts of autonomy and self-confidence are quite different in a military environment from those in a civilian environment. The military's recruit indoctrination provided a basis for the concept, the factors affecting it, and its measurement and communication. These and other aspects of self-confidence in autonomous systems are discussed, along with examples based on current research on the interface between human operators and such systems.


Minecraft as an Experimental World for AI in Robotics

AAAI Conferences

Performing experimental research on robotic platforms involves numerous practical complications, while studying collaborative interactions and efficiently collecting data from humans benefit from real-time response. Roboticists can circumvent some complications by using simulators like Gazebo to test algorithms and by building games like the Mars Escape game to collect data. Making use of existing resources for simulation and game creation requires the development of assets and algorithms along with the recruitment and training of users. We have created a Minecraft mod called BurlapCraft which enables the use of the reinforcement learning and planning library BURLAP to model and solve different tasks within Minecraft. BurlapCraft makes AI-HRI development easier in three core ways: the underlying Minecraft environment makes the construction of experiments simple for the developer and so allows rapid prototyping of experimental setups; BURLAP contributes a wide variety of extensible algorithms for learning and planning, allowing easy iteration and development of task models and algorithms; and the familiarity and ubiquity of Minecraft trivializes the recruitment and training of users. To validate BurlapCraft as a platform for AI development, we demonstrate the execution of A*, BFS, RMax, language understanding, and learning language groundings from user demonstrations in five Minecraft "dungeons."
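
As a flavour of the kind of task BurlapCraft poses to a planner, here is a plain breadth-first search over a tiny grid "dungeon". This is only an illustration in Python; it does not use the BURLAP or BurlapCraft APIs (which are Java), and the map and names are made up.

```python
from collections import deque

# Illustrative sketch only: breadth-first search over a tiny grid "dungeon",
# standing in for the navigation tasks BurlapCraft poses to BURLAP's planners.
# This is NOT the BURLAP or BurlapCraft API; the map and moves are made up.

DUNGEON = ["#####",
           "#S..#",
           "#.#.#",
           "#..G#",
           "#####"]

def bfs_plan(grid):
    """Return a list of (row, col) cells from the start 'S' to the goal 'G'."""
    cells = {(r, c): ch for r, row in enumerate(grid) for c, ch in enumerate(row)}
    start = next(p for p, ch in cells.items() if ch == "S")
    goal = next(p for p, ch in cells.items() if ch == "G")
    frontier, parent = deque([start]), {start: None}
    while frontier:
        cur = frontier.popleft()
        if cur == goal:
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nxt in [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]:
            if cells.get(nxt, "#") != "#" and nxt not in parent:
                parent[nxt] = cur
                frontier.append(nxt)
    return None

if __name__ == "__main__":
    print(bfs_plan(DUNGEON))
```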


Autonomous Electricity Trading Using Time-Of-Use Tariffs in a Competitive Market

AAAI Conferences

This research studies the impact of Time-Of-Use (TOU) tariffs in a competitive electricity marketplace. Specifically, it focuses on how an autonomous broker agent should optimize TOU tariffs in a competitive retail market, and on the impact of such tariffs on the economy. We formalize the problem of TOU tariff optimization and propose an algorithm for approximating its solution. We experiment extensively with our algorithm in a large-scale, detailed electricity retail market simulation, the Power Trading Agent Competition (Power TAC), and: 1) find that our algorithm results in a 15% peak-demand reduction; 2) find that its peak-flattening results in greater profits and/or profit share for the broker and allows it to win in head-to-head competition against the 1st and 2nd place brokers from the Power TAC 2014 finals; and 3) analyze several economic implications of using TOU tariffs in competitive retail markets.
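
One way to picture the TOU tariff optimization problem is to score candidate rate profiles against a price-responsive demand model, trading off revenue against the demand peak. The sketch below illustrates that framing only; it is not the authors' algorithm or the Power TAC simulation, and the demand curve, elasticity, and rates are invented.

```python
# Illustrative sketch (not the authors' algorithm): score candidate Time-Of-Use
# rate profiles against a simple price-elastic hourly demand model, preferring
# tariffs that reduce the demand peak while keeping revenue up. All numbers and
# the elasticity model are made up for the example.

BASE_DEMAND = [40, 35, 30, 30, 35, 50, 70, 90, 85, 80, 75, 70,
               70, 72, 75, 80, 90, 100, 95, 85, 70, 60, 50, 45]   # kWh per hour
FLAT_RATE = 0.15                                                   # $/kWh
ELASTICITY = -0.3                                                  # demand response

def shifted_demand(rates):
    """Hourly demand after customers respond to prices (constant elasticity)."""
    return [d * (1 + ELASTICITY * (r - FLAT_RATE) / FLAT_RATE)
            for d, r in zip(BASE_DEMAND, rates)]

def score(rates, peak_weight=2.0):
    """Higher is better: revenue minus a penalty on the demand peak."""
    demand = shifted_demand(rates)
    revenue = sum(d * r for d, r in zip(demand, rates))
    return revenue - peak_weight * max(demand)

if __name__ == "__main__":
    flat = [FLAT_RATE] * 24
    tou = [0.12] * 7 + [0.20] * 12 + [0.12] * 5      # higher rate 07:00-19:00
    print("flat:", round(score(flat), 1), "TOU:", round(score(tou), 1))
```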