Plotting

 Zilberstein, Shlomo


Privacy-Preserving Policy Iteration for Decentralized POMDPs

AAAI Conferences

We propose the first privacy-preserving approach to address the privacy issues that arise in multi-agent planning problems modeled as a Dec-POMDP. Our solution is a distributed message-passing algorithm based on trials, where the agents' policies are optimized using the cross-entropy method. In our algorithm, the agents' private information is protected using a public-key homomorphic cryptosystem. We prove the correctness of our algorithm and analyze its complexity in terms of message passing and encryption/decryption operations. Furthermore, we analyze several privacy aspects of our algorithm and show that it can preserve the agent privacy of non-neighbors, model privacy, and decision privacy. Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach.


Does the Human's Representation Matter for Unsupervised Activity Recognition?

AAAI Conferences

Unsupervised activity recognition introduces the opportunity for more robust interaction experiences with machines because the human is not limited to only acting with respect to a training dataset. Many approaches currently use latent variable models that have been well studied and developed by the natural language research communities. However, these models are simply used as-is or with minor tweaks on datasets that present an analogy between sensor reading sequences and text documents. Although words have well-defined semantics so that the learned clusters can be interpreted and verified, this is not often the case for sensor readings. For example, novel data from new human activities need to be classified, which relies on the learned clusters; so how does one confirm that new activities are being correctly processed by a robot for interaction? We present several ways that motion capture information can be represented for use in these methods, and then illustrate how the representation choice has the potential to produce variations in the learned clusters.


Fast SSP Solvers Using Short-Sighted Labeling

AAAI Conferences

State-of-the-art methods for solving SSPs often work by limiting planning to restricted regions of the state space. The resulting problems can then be solved quickly, and the process is repeated during execution when states outside the restricted region are encountered. Typically, these approaches focus on states that are within some distance measure of the start state (e.g., number of actions or probability of being reached). However, these short-sighted approaches make it difficult to propagate information from states that are closer to a goal than to the start state, thus missing opportunities to improve planning. We present an alternative approach in which short-sightedness is used only to determine whether a state should be labeled as solved or not, but otherwise the set of states that can be accounted for during planning is unrestricted. Based on this idea, we propose the FLARES algorithm and show that it performs consistently well on a wide range of benchmark problems.


Integration of Planning with Recognition for Responsive Interaction Using Classical Planners

AAAI Conferences

Interaction between multiple agents requires some form of coordination and a level of mutual awareness. When computers and robots interact with people, they need to recognize human plans and react appropriately. Plan and goal recognition techniques have focused on identifying an agent's task given a sufficiently long action sequence. However, by the time the plan and/or goal are recognized, it may be too late for computing an interactive response. We propose an integration of planning with probabilistic recognition where each method uses intermediate results from the other as a guiding heuristic for recognition of the plan/goal in-progress as well as the interactive response. We show that, like the used recognition method, these interaction problems can be compiled into classical planning problems and solved using off-the-shelf methods. In addition to the methodology, this paper introduces problem categories for different forms of interaction, an evaluation metric for the benefits from the interaction, and extensions to the recognition algorithm that make its intermediate results more practical while the plan is in progress.


Robust Optimization for Tree-Structured Stochastic Network Design

AAAI Conferences

Stochastic network design is a general framework for optimizing network connectivity. It has several applications in computational sustainability including spatial conservation planning, pre-disaster network preparation, and river network optimization. A common assumption in previous work has been made that network parameters (e.g., probability of species colonization) are precisely known, which is unrealistic in real- world settings. We therefore address the robust river network design problem where the goal is to optimize river connectivity for fish movement by removing barriers. We assume that fish passability probabilities are known only imprecisely, but are within some interval bounds. We then develop a planning approach that computes the policies with either high robust ratio or low regret. Empirically, our approach scales well to large river networks. We also provide insights into the solutions generated by our robust approach, which has significantly higher robust ratio than the baseline solution with mean parameter estimates.


Redesigning Stochastic Environments for Maximized Utility

AAAI Conferences

We present the Utility Maximizing Design (UMD) model for optimally redesigning stochastic environments to achieve maximized performance. This model suits well contemporary applications that involve the design of environments where robots and humans co-exist an co-operate, e.g., vacuum cleaning robot. We discuss two special cases of the UMD model. The first is the equi-reward UMD (ER-UMD) in which the agents and the system share a utility function, such as for the vacuum cleaning robot. The second is the goal recognition design (GRD) setting, discussed in the literature, in which system and agent utilities are independent. To find the set of optimal modifications to apply to a UMD model, we present a generic method, based on heuristic search. After specifying the conditions for optimality in the general case, we present an admissible heuristic for the ER-UMD case. We also present a novel compilation that embeds the redesign process into a planning problem, allowing use of any off-the-shelf solver to find the best way to modify an environment when a design budget is specified. Our evaluation shows the feasibility of the approach using standard benchmarks from the probabilistic planning competition.


Safety in AI-HRI: Challenges Complementing User Experience Quality

AAAI Conferences

Contemporary research in human-robot interaction (HRI) predominantly focuses on the user's experience while controlling a robot. However, with the increased deployment of artificial intelligence (AI) techniques, robots are quickly becoming more autonomous in both academic and industrial experimental settings. In addition to improving the user's interactive experience with AI-operated robots through personalization, dialogue, emotions, and dynamic behavior, there is also a growing need to consider the safety of the interaction. AI may not account for the user's less likely responses, making it possible for an unaware user to be injured by the robot if they have a collision. Issues of trust and acceptance may also come into play if users cannot always understand the robot's thought process, creating a potential for emotional harm. We identify challenges that will need to be addressed in safe AI-HRI and provide an overview of approaches to consider for them, many stemming from the contemporary research.


Optimizing Resilience in Large Scale Networks

AAAI Conferences

We propose a decision making framework to optimize the resilience of road networks to natural disasters such as floods. Our model generalizes an existing one for this problem by allowing roads with a broad class of stochastic delay models. We then present a fast algorithm based on the sample average approximation (SAA) method and network design techniques to solve this problem approximately. On a small existing benchmark, our algorithm produces near-optimal solutions and the SAA method converges quickly with a small number of samples. We then apply our algorithm to a large real-world problem to optimize the resilience of a road network to failures of stream crossing structures to minimize travel times of emergency medical service vehicles. On medium-sized networks, our algorithm obtains solutions of comparable quality to a greedy baseline method but is 30โ€“60 times faster. Our algorithm is the only existing algorithm that can scale to the full network, which has many thousands of edges.


A Parallel Point-Based POMDP Algorithm Leveraging GPUs

AAAI Conferences

We parallelize the Point-Based Value Iteration (PBVI) algorithm, which approximates the solution to Partially Observable Markov Decision Processes (POMDPs), using a Graphics Processing Unit (GPU). We detail additional optimizations, such as leveraging the bounded size of non-zero values over all belief point vectors, usable by serial and parallel algorithms. We compare serial (CPU) and parallel (GPU) implementations on 10 distinct problem domains, and demonstrate that our approach provides an order of magnitude improvement.


Temporal and Object Relations in Unsupervised Plan and Activity Recognition

AAAI Conferences

We consider ways to improve the performance of unsupervised plan and activity recognition techniques by considering temporal and object relations in addition to postural data. Temporal relationships can help recognize activities with cyclic structure and are often implicit because plans have degrees of ordering actions. Relations with objects can help disambiguate observed activities that otherwise share a user's posture and position. We develop and investigate graphical models that extend the popular latent Dirichlet allocation approach with temporal and object relations, examine the relative performance and runtime trade-offs using a standard dataset, and consider the cost/benefit trade-offs these extensions offer in the context of human-robot and humancomputer interaction.