Goto

Collaborating Authors

 Gretton, Charles


Computer Assisted Composition in Continuous Time

arXiv.org Machine Learning

We address the problem of combining sequence models of symbolic music with user defined constraints. For typical models this is nontrivial as only the conditional distribu - tion of each symbol given the earlier symbols is available, while the constraints correspond to arbitrary times. Previ - ously this has been addressed by assuming a discrete time model of fixed rhythm. We generalise to continuous time and arbitrary rhythm by introducing a simple, novel, and efficie nt particle filter scheme, applicable to general continuous ti me point processes. Extensive experimental evaluations demo n-strate that in comparison with a more traditional beam searc h baseline, the particle filter exhibits superior statistica l properties and yields more agreeable results in an extensive human listening test experiment.


A TIL-Relaxed Heuristic for Planning with Time Windows

AAAI Conferences

We consider planning problems with time windows, in which the availability of discrete resources is time constrained. We develop a novel heuristic that addresses specifically the difficulty of coordinating actions within time windows. The heuristic is based on solving a temporally relaxed problem and measuring the magnitude by which the relaxed solution violates the time window constraints. Applied in a state-space search planner, the heuristic reduces the number of dead-ends encountered during search, and improves planner coverage.


A State-Space Acyclicity Property for Exponentially Tighter Plan Length Bounds

AAAI Conferences

We investigate compositional bounding of transition system diameters, with application in bounding the lengths of plans. We establish usefully-tight bounds by exploiting acyclicity in state-spaces. We provide mechanised proofs in HOL4 of the validity of our approach. Evaluating our bounds in a range of benchmarks, we demonstrate exponentially tighter upper bounds compared to existing methods. Treating both solvable and unsolvable benchmark problems, we also demonstrate the utility of our bounds in boosting planner performance. We enhance an existing planning procedure to use our bounds, and demonstrate significant coverage improvements, both compared to the base planner, and also in comparisons with state-of-the-art systems.


A Study of Proxies for Shapley Allocations of Transport Costs

AAAI Conferences

We propose and evaluate a number of solutions to the problem of calculating the cost to serve each location in a single-vehicle transport setting. Such cost to serve analysis has application both strategically and operationally in transportation. The problem is formally given by the traveling salesperson game (TSG), a cooperative total utility game in which agents correspond to locations in a travelling salesperson problem (TSP). The cost to serve a location is an allocated portion of the cost of an optimal tour. The Shapley value is one of the most important normative division schemes in cooperative games, giving a principled and fair allocation both for the TSG and more generally. We consider a number of direct and sampling-based procedures for calculating the Shapley value, and present the first proof that approximating the Shapley value of the TSG within a constant factor is NP-hard. Treating the Shapley value as an ideal baseline allocation, we then develop six proxies for that value which are relatively easy to compute. We perform an experimental evaluation using Synthetic Euclidean games as well as games derived from real-world tours calculated for fast-moving consumer goods scenarios. Our experiments show that several computationally tractable allocation techniques correspond to good proxies for the Shapley value.


Implementation and Comparison of Solution Methods for Decision Processes with Non-Markovian Rewards

arXiv.org Artificial Intelligence

This paper examines a number of solution methods for decision processes with non-Markovian rewards (NMRDPs). They all exploit a temporal logic specification of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to well-known MDP solution methods. They differ however in the representation of the target MDP and the class of MDP solution methods to which they are suited. As a result, they adopt different temporal logics and different translations. Unfortunately, no implementation of these methods nor experimental let alone comparative results have ever been reported. This paper is the first step towards filling this gap. We describe an integrated system for solving NMRDPs which implements these methods and several variants under a common interface; we use it to compare the various approaches and identify the problem features favoring one over the other.


Exploiting First-Order Regression in Inductive Policy Selection

arXiv.org Artificial Intelligence

We consider the problem of computing optimal generalised policies for relational Markov decision processes. We describe an approach combining some of the benefits of purely inductive techniques with those of symbolic dynamic programming methods. The latter reason about the optimal value function using first-order decision theoretic regression and formula rewriting, while the former, when provided with a suitable hypotheses language, are capable of generalising value functions or policies for small instances. Our idea is to use reasoning and in particular classical first-order regression to automatically generate a hypotheses language dedicated to the domain at hand, which is then used as input by an inductive solver. This approach avoids the more complex reasoning of symbolic dynamic programming while focusing the inductive solver's attention on concepts that are specifically relevant to the optimal value function for the domain considered.


A Switching Planner for Combined Task and Observation Planning

AAAI Conferences

From an automated planning perspective the problem of practical mobile robot control in realistic environments poses many important and contrary challenges. On the one hand, the planning process must be lightweight, robust, and timely. Over the lifetime of the robot it must always respond quickly with new plans that accommodate exogenous events, changing objectives, and the underlying unpredictability of the environment. On the other hand, in order to promote efficient behaviours the planning process must perform computationally expensive reasoning about contingencies and possible revisions of subjective beliefs according to quantitatively modelled uncertainty in acting and sensing. Towards addressing these challenges, we develop a continual planning approach that switches between using a fast satisficing "classical" planner, to decide on the overall strategy, and decision-theoretic planning to solve small abstract subproblems where deeper consideration of the sensing model is both practical, and can significantly impact overall performance. We evaluate our approach in large problems from a realistic robot exploration domain.


SAT-Based Parallel Planning Using a Split Representation of Actions

AAAI Conferences

Planning based on propositional SAT(isfiability) is a powerful approach to computing step-optimal plans given a parallel execution semantics. In this setting: (i) a solution plan must be minimal in the number of plan steps required, and (ii) non-conflicting actions can be executed instantaneously in parallel at a plan step. Underlying SAT-based approaches is the invocation of a decision procedure on a SAT encoding of a bounded version of the problem. A fundamental limitation of existing approaches is the size of these encodings. This problem stems from the use of a direct representation of actions โ€” i.e. each action has a corresponding variable in the encoding. A longtime goal in planning has been to mitigate this limitation by developing a more compact split โ€” also termed lifted โ€” representation of actions in SAT encodings of parallel step-optimal problems. This paper describes such a representation. In particular, each action and each parallel execution of actions is represented uniquely as a conjunct of variables. Here, each variable is derived from action pre and post- conditions . Because multiple actions share conditions , our encoding of the planning constraints is factored and relatively compact. We find experimentally that our encoding yields a much more efficient and scalable planning procedure over the state-of-the-art in a large set of planning benchmarks.