Goto

Collaborating Authors

 Search


On Interruptible Pure Exploration in Multi-Armed Bandits

AAAI Conferences

Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB favorably compete both with the currently popular strategies UCB1 and Epsilon-Greedy, as well as with the conservative uniform sampling.


Sharing Rides with Friends: A Coalition Formation Algorithm for Ridesharing

AAAI Conferences

We consider the Social Ridesharing (SR) problem, where a set of commuters, connected through a social network, arrange one-time rides at short notice. In particular, we focus on the associated optimisation problem of forming cars to minimise the travel cost of the overall system modelling such problem as a graph constrained coalition formation (GCCF) problem, where the set of feasible coalitions is restricted by a graph (i.e., the social network). Moreover, we significantly extend the state of the art algorithm for GCCF, i.e., the CFSS algorithm, to solve our GCCF model of the SR problem. Our empirical evaluation uses a real dataset for both spatial (GeoLife) and social data (Twitter), to validate the applicability of our approach in a realistic application scenario. Empirical results show that our approach computes optimal solutions for systems of medium scale (up to 100 agents) providing significant cost reductions (up to -36.22%). Moreover, we can provide approximate solutions for very large systems (i.e., up to 2000 agents) and good quality guarantees (i.e., with an approximation ratio of 1.41 in the worst case) within minutes (i.e., 100 seconds).


Lifting Model Sampling for General Game Playing to Incomplete-Information Models

AAAI Conferences

General Game Playing is the design of AI systems able to understand the rules of new games and to use such descriptions to play those games effectively. Games with incomplete information have recently been added as anew challenge for general game-playing systems. The only published solutions to this challenge are based on sampling complete information models. In doing so they ground all of the unknown information, thereby making information gathering moves of no value; a well-known criticism of such sampling based systems. We present and analyse a method for escalating reasoning from complete information models to incomplete information models and show how this enables a general game player to correctly value information in incomplete information games. Experimental results demonstrate the success of this technique over standard model sampling.


BDDs Strike Back (in AI Planning)

AAAI Conferences

The cost-optimal track of the international planning competition in 2014 has seen an unexpected outcome. Different to the precursing competition in 2011, where explicit-state heuristic search planning scored best, advances in the state-set exploration with BDDs showed a significant lead. In this paper we review the outcome of the competition, briefly looking into the internals of the competing systems.


Interactive Narrative Planning in The Best Laid Plans

AAAI Conferences

The Best Laid Plans is an interactive narrative video game that uses cognitive-inspired fast planning techniques to generate stories with conflict during play. Players alternate between acting out a plan and seeing that plan thwarted by non-player characters. The Glaive narrative planner combines causal-link-based computational models of narrative with the speed of fast heuristic search techniques to adapt the story each time the player attempts a new plan.


Non-Classical Planning for Robotic Applications

AAAI Conferences

For my dissertation I am focusing on non-classical planning for robotic applications. Much classical planning research relies on assumptions that do not hold in real world robotics applications. In many cases the entire world state is not known in advance and the events that occur in the future can not be known with certainty. Robots operating in the real world also need to be responsive and react to dynamic obstacles and events.


Planning with Numeric Timed Initial Fluents

AAAI Conferences

Numeric Timed Initial Fluents represent a new feature in PDDL that extends the concept of Timed Initial Literals to numeric fluents. They are particularly useful to model independent functions that change through time and influence the actions to be applied. Although they are very useful to model real world problems, they are not systematically defined in the family of PDDL languages and they are not implemented in any generic PDDL planner, except for POPF2 and UPMurphi. In this paper we present an extension of the planner POPF2 (POPF-TIF) to handle problems with numeric Timed Initial Fluents. We propose and evaluate two contributions: the first is based on improvements of the heuristic evaluation, while the second considers alternative search algorithms based on a mixture of Enforced Hill Climbing and Best First Search.


An Exact Algorithm for Solving Most Relevant Explanation in Bayesian Networks

AAAI Conferences

Most Relevant Explanation (MRE) is a new inference task in Bayesian networks that finds the most relevant partial instantiation of target variables as an explanation for given evidence by maximizing the Generalized Bayes Factor (GBF). No exact algorithm has been developed for solving MRE previously. This paper fills the void and introduces a breadth-first branch-and-bound MRE algorithm based on a novel upper bound on GBF. The bound is calculated by decomposing the computation of the score to a set of Markov blankets of subsets of evidence variables. Our empirical evaluations show that the proposed algorithm scales up exact MRE inference significantly.


Just Count the Satisfied Groundings: Scalable Local-Search and Sampling Based Inference in MLNs

AAAI Conferences

The main computational bottleneck in various sampling based and local-search based inference algorithms for Markov logic networks (e.g., Gibbs sampling, MC-SAT, MaxWalksat, etc.) is computing the number of groundings of a first-order formula that are true given a truth assignment to all of its ground atoms. We reduce this problem to the problem of counting the number of solutions of a constraint satisfaction problem (CSP) and show that during their execution, both sampling based and local-search based algorithms repeatedly solve dynamic versions of this counting problem. Deriving from the vast amount of literature on CSPs and graphical models, we propose an exact junction-tree based algorithm for computing the number of solutions of the dynamic CSP, analyze its properties, and show how it can be used to improve the computational complexity of Gibbs sampling and MaxWalksat. Empirical tests on a variety of benchmarks clearly show that our new approach is several orders of magnitude more scalable than existing approaches.


Tighter Value Function Bounds for Bayesian Reinforcement Learning

AAAI Conferences

Bayesian reinforcement learning (BRL) provides a principled framework for optimal exploration-exploitation tradeoff in reinforcement learning. We focus on model based BRL, which involves a compact formulation of the optimal tradeoff from the Bayesian perspective. However, it still remains a computational challenge to compute the Bayes-optimal policy. In this paper, we propose a novel approach to compute tighter value function bounds of the Bayes-optimal value function, which is crucial for improving the performance of many model-based BRL algorithms. We then present how our bounds can be integrated into real-time AO* heuristic search, and provide a theoretical analysis on the impact of improved bounds on the search efficiency. We also provide empirical results on standard BRL domains that demonstrate the effectiveness of our approach.