Goto

Collaborating Authors

 Learning Graphical Models


Statistical Decision Making for Authentication and Intrusion Detection

arXiv.org Machine Learning

Classification is the problem of categorising data in one of two or more possible classes. In the classical supervised learning framework, examples of each class have already been obtained and the task of the decision maker is to accurately categorise new observations, whose class is unknown. The accuracy is either measured in terms of the rate of misclassification, or in terms of the average cost, for problems where different types of errors carry different costs. In that setting, the problem has three phases: (a) the collection of training data, (b) the estimation of a decision rule based on the training data and (c) the application 1 of the decision rule to new data. Typically, the decision rule remains fixed after the second step.


Discrete MDL Predicts in Total Variation

arXiv.org Machine Learning

The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model. We show that for a countable class of models, MDL predictions are close to the true distribution in a strong sense. The result is completely general. No independence, ergodicity, stationarity, identifiability, or other assumption on the model class need to be made. More formally, we show that for any countable class of models, the distributions selected by MDL (or MAP) asymptotically predict (merge with) the true measure in the class in total variation distance. Implications for non-i.i.d. domains like time-series forecasting, discriminative learning, and reinforcement learning are discussed.


Multi-Agent Online Planning with Communication

AAAI Conferences

We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems off-line. The key challenge is to produce coordinated behavior using little or no communication. When communication is allowed but constrained, the challenge is to produce high value with minimal communication. The algorithm addresses these challenges by communicating only when history inconsistency is detected, allowing communication to be postponed if necessary. Moreover, it bounds the memory usage at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing off-line planning algorithms and it outperforms the best online method, producing higher value with much less communication in most cases.


Exploiting Coordination Locales in Distributed POMDPs via Social Model Shaping

AAAI Conferences

Distributed POMDPs provide an expressive framework for modeling multiagent collaboration problems, but NEXP-Complete complexity hinders their scalability and application in real-world domains. This paper introduces a subclass of distributed POMDPs, and TREMOR, an algorithm to solve such distributed POMDPs. The primary novelty of TREMOR is that agents plan individually with a single agent POMDP solver and use social model shaping to implicitly coordinate with other agents. Experiments demonstrate that TREMOR can provide solutions orders of magnitude faster than existing algorithms while achieving comparable, or even superior, solution quality.


A Decision-Theoretic Approach to Dynamic Sensor Selection in Camera Networks

AAAI Conferences

Nowadays many urban areas have been equipped with networks of surveillance cameras, which can be used for automatic localization and tracking of people. However, given the large resource demands of imaging sensors in terms of bandwidth and computing power, processing the image streams of all cameras simultaneously might not be feasible. In this paper, we consider the problem of dynamical sensor selection based on user-defined objectives, such as maximizing coverage or improved localization uncertainty.  We propose a decision-theoretic approach modeled as a POMDP, which selects k sensors to consider in the next time frame, incorporating all observations made in the past. We show how, by changing the POMDP's reward function, we can change the system's behavior in a straightforward manner, fulfilling the user's chosen objective. We successfully apply our techniques to a network of 10 cameras.


Navigation Planning in Probabilistic Roadmaps with Uncertainty

AAAI Conferences

Probabilistic Roadmaps (PRM) are a commonly used class of algorithms for robot navigation tasks where obstacles are present in the environment. We examine the situation where the obstacle positions are not precisely known. A subset of the edges in the PRM graph may possibly intersect the obstacles, and as the robot traverses the graph it can make noisy observations of these uncertain edges to determine if it can traverse them or not. The problem is to traverse the graph from an initial vertex to a goal without taking a blocked edge, and to do this optimally the robot needs to consider the observations it can make as well as the structure of the graph. In this paper we show how this problem can be represented as a POMDP. We show that while too large to be solved with exact methods, approximate point based methods can provide a good quality solution. While feasible for smaller examples, this approach isn't scalable. By exploiting the structure in the belief space, we can construct an approximate belief-space MDP that can be solved efficiently using recent techniques in MDP planning. We then demonstrate that this gives near optimal results in most cases while achieving an order of magnitude speed-up in policy generation time.


Minimal Sufficient Explanations for Factored Markov Decision Processes

AAAI Conferences

Explaining policies of Markov Decision Processes (MDPs) is complicated due to their probabilistic and sequential nature. We present a technique to explain policies for factored MDP by populating a set of domain-independent templates. We also present a mechanism to determine a minimal set of templates that, viewed together, completely justify the policy. Our explanations can be generated automatically at run-time with no additional effort required from the MDP designer. We demonstrate our technique using the problems of advising undergraduate students in their course selection and assisting people with dementia in completing the task of handwashing. We also evaluate our explanations for course-advising through a user study involving students.


Efficient Solutions to Factored MDPs with Imprecise Transition Probabilities

AAAI Conferences

When modeling real-world decision-theoretic planning problems in the Markov decision process (MDP) framework, it is often impossible to obtain a completely accurate estimate of transition probabilities. For example, natural uncertainty arises in the transition specification due to elicitation of MDP transition models from an expert or data, or non-stationary transition distributions arising from insufficient state knowledge. In the interest of obtaining the most robust policy under transition uncertainty, the Markov Decision Process with Imprecise Transition Probabilities (MDP-IPs) has been introduced to model such scenarios. Unfortunately, while solutions to the MDP-IP are well-known, they require nonlinear optimization and are extremely time-consuming in practice. To address this deficiency, we propose efficient dynamic programming methods to exploit the structure of factored MDPIPs. Noting that the key computational bottleneck in the solution of MDP-IPs is the need to repeatedly solve nonlinear constrained optimization problems, we show how to target approximation techniques to drastically reduce the computational overhead of the nonlinear solver while producing bounded, approximately optimal solutions. Our results show up to two orders of magnitude speedup in comparison to traditional “flat” dynamic programming approaches and up to an order of magnitude speedup over the extension of factored MDP approximate value iteration techniques to MDP-IPs.


Focused Topological Value Iteration

AAAI Conferences

Topological value iteration (TVI) is an effective algorithm for solving Markov decision processes (MDPs) optimally, which (1) divides an MDP into strongly-connected components, and (2) solves these components sequentially. Yet, TVI's usefulness tends to degrade if an MDP has large components, because the cost of the division process isn't offset by gains during solution.  This paper presents a new algorithm to solve MDPs optimally, focused  topological value iteration (FTVI). FTVI addresses TVI's limitations by restricting its attention to connected components that are relevant for solving the MDP. Specifically, FTVI uses a small amount of heuristic search to eliminate provably sub-optimal actions; this pruning allows FTVI to find smaller connected components, thus running faster.  We demonstrate that our new algorithm outperforms TVI by an order of magnitude, averaged across several domains. Surprisingly, FTVI also significantly outperforms popular "heuristically-informed" MDP algorithms such as LAO*, LRTDP, and BRTDP in many domains, sometimes by as much as two orders of magnitude. Finally, we characterize the type of domains where FTVI excels — suggesting a way to an informed choice of solver.


A Human-Aware Robot Task Planner

AAAI Conferences

The growing presence of household robots in inhabited environments arises the need for new robot task planning techniques. These techniques should take into consideration not only the actions that the robot can perform or unexpected external events, but also the actions performed by a human sharing the same environment, in order to improve the cohabitation of the two agents, e.g., by avoiding undesired situations for the human. In this paper, we present a human-aware planner able to address this problem. This planner supports alternative hypotheses of the human plan, temporal duration for the actions of both the robot and the human, constraints on the interaction between robot and human, partial goal achievement and, most importantly, the possibility to use observations of human actions in the policy generated for the robot. The planner has been tested as a standalone component and in conjunction with our framework for human-robot interaction in a real environment.