Policy Iteration for Decentralized Control of Markov Decision Processes
Bernstein, D. S., Amato, C., Hansen, E. A., Zilberstein, S.
Coordination of distributed agents is required for problems arising in many areas, including multi-robot systems, networking and e-commerce. As a formal framework for such problems, we use the decentralized partially observable Markov decision process (DEC-POMDP). Though much work has been done on optimal dynamic programming algorithms for the single-agent version of the problem, optimal algorithms for the multiagent case have been elusive. The main contribution of this paper is an optimal policy iteration algorithm for solving DEC-POMDPs. The algorithm uses stochastic finite-state controllers to represent policies. The solution can include a correlation device, which allows agents to correlate their actions without communicating. This approach alternates between expanding the controller and performing value-preserving transformations, which modify the controller without sacrificing value. We present two efficient value-preserving transformations: one can reduce the size of the controller and the other can improve its value while keeping the size fixed. Empirical results demonstrate the usefulness of value-preserving transformations in increasing value while keeping controller size to a minimum. To broaden the applicability of the approach, we also present a heuristic version of the policy iteration algorithm, which sacrifices convergence to optimality. This algorithm further reduces the size of the controllers at each step by assuming that probability distributions over the other agents' actions are known. While this assumption may not hold in general, it helps produce higher quality solutions in our test problems.
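As a rough illustration of the policy representation described above, the following sketch shows stochastic finite-state controllers whose action and node-transition distributions are conditioned on a shared correlation-device signal. All names (`Controller`, `sample`, `run`), the one-node controllers, the two actions, and the single observation `"o"` are hypothetical choices made for this example; they are not taken from the paper, and no value computation or policy iteration step is shown.

```python
import random

def sample(dist, rng):
    """Draw an outcome from a dict mapping outcomes to probabilities."""
    r = rng.random()
    acc = 0.0
    for outcome, p in dist.items():
        acc += p
        if r < acc:
            return outcome
    return outcome  # guard against floating-point rounding

class Controller:
    """Stochastic finite-state controller for one agent (illustrative only)."""
    def __init__(self, action_dist, node_dist, start_node=0):
        self.action_dist = action_dist  # (node, signal) -> {action: prob}
        self.node_dist = node_dist      # (node, action, obs, signal) -> {next_node: prob}
        self.node = start_node

    def act(self, signal, rng):
        return sample(self.action_dist[(self.node, signal)], rng)

    def update(self, action, obs, signal, rng):
        self.node = sample(self.node_dist[(self.node, action, obs, signal)], rng)

def run(steps, seed=0):
    rng = random.Random(seed)
    # Correlation device: a two-state Markov chain that every agent can read.
    device = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}
    # Assumed toy policy: each agent's action is determined by the signal.
    action_dist = {(0, 0): {"left": 1.0}, (0, 1): {"right": 1.0}}
    node_dist = {(0, a, "o", c): {0: 1.0}
                 for a in ("left", "right") for c in (0, 1)}
    agents = [Controller(action_dist, node_dist) for _ in range(2)]
    signal, history = 0, []
    for _ in range(steps):
        signal = sample(device[signal], rng)  # device moves on its own
        joint = tuple(agent.act(signal, rng) for agent in agents)
        for agent, action in zip(agents, joint):
            agent.update(action, "o", signal, rng)  # fixed dummy observation
        history.append(joint)
    return history

history = run(20)
```

In this toy setup the two agents never exchange messages, yet their actions always match because both read the same device signal.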
Syntactic Confluence Criteria for Positive/Negative-Conditional Term Rewriting Systems
We study how the following known ideas for showing confluence of unconditional or conditional term rewriting systems can be combined into practically more useful confluence criteria for conditional systems: our syntactical separation into constructor and non-constructor symbols; Huet's introduction and Toyama's generalization of parallel closedness for non-noetherian unconditional systems; the use of shallow confluence for proving confluence of noetherian and non-noetherian conditional systems; the idea that certain kinds of limited confluence can be assumed for checking the fulfilledness or infeasibility of the conditions of conditional critical pairs; and the idea that (when termination is given) only prime superpositions have to be considered and certain normalization restrictions can be applied to the substitutions fulfilling the conditions of conditional critical pairs. Besides combining and improving known methods, we present the following new ideas and results. We strengthen the criterion for overlay joinable noetherian systems. By using the expressiveness of our syntactical separation into constructor and non-constructor symbols, we present criteria for level confluence that are not actually criteria for shallow confluence, and we weaken the severe requirement of normality (stiffened with left-linearity) in the criteria for shallow confluence of noetherian and non-noetherian conditional systems to the easily satisfied requirement of quasi-normality. Finally, the paper as a whole may also serve as a practically useful overview of the syntactical means for showing confluence of conditional term rewriting systems.
Asynchronous Forward Bounding for Distributed COPs
Gershman, A., Meisels, A., Zivan, R.
A new search algorithm for solving distributed constraint optimization problems (DisCOPs) is presented. Agents assign variables sequentially and compute bounds on partial assignments asynchronously. The asynchronous bounds computation is based on the propagation of partial assignments. The asynchronous forward-bounding algorithm (AFB) is a distributed optimization search algorithm that keeps one consistent partial assignment at all times. The algorithm is described in detail and its correctness is proven. Experimental evaluation shows that AFB outperforms synchronous branch and bound by many orders of magnitude, and produces a phase transition as the tightness of the problem increases. This effect is analogous to the phase transition that has been observed when local consistency maintenance is applied to MaxCSPs. The AFB algorithm is further enhanced by the addition of a backjumping mechanism, resulting in the AFB-BJ algorithm. Distributed backjumping is based on accumulated information on bounds of all values and on concurrently processing a queue of candidate goals for the next move back. The AFB-BJ algorithm is compared experimentally to other DisCOP algorithms (ADOPT, DPOP, OptAPO) and is shown to be a very efficient algorithm for DisCOPs.
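The bounding idea in the abstract above can be illustrated with a centralized, single-process sketch: a branch and bound search that prunes a partial assignment when its cost plus a forward bound on the unassigned variables cannot beat the best solution found so far. This is not the distributed, asynchronous AFB algorithm itself. The function names, the cost-table layout, and the toy problem are assumptions made for this example, and the bound is only valid when all constraint costs are non-negative.

```python
def pair_cost(costs, i, vi, j, vj):
    """Cost of the binary constraint between variables i and j (0 if none)."""
    if (i, j) in costs:
        return costs[(i, j)].get((vi, vj), 0)
    if (j, i) in costs:
        return costs[(j, i)].get((vj, vi), 0)
    return 0

def forward_bound(costs, domains, assignment):
    """Lower bound on the cost that unassigned variables must incur
    against the variables already assigned (non-negative costs assumed)."""
    k = len(assignment)
    bound = 0
    for j in range(k, len(domains)):
        bound += min(
            sum(pair_cost(costs, i, assignment[i], j, vj) for i in range(k))
            for vj in domains[j]
        )
    return bound

def branch_and_bound(costs, domains):
    best = {"cost": float("inf"), "assignment": None}

    def extend(assignment, cost_so_far):
        # Prune when even the optimistic estimate cannot improve on the best.
        if cost_so_far + forward_bound(costs, domains, assignment) >= best["cost"]:
            return
        k = len(assignment)
        if k == len(domains):
            best["cost"], best["assignment"] = cost_so_far, list(assignment)
            return
        for v in domains[k]:
            added = sum(pair_cost(costs, i, assignment[i], k, v) for i in range(k))
            extend(assignment + [v], cost_so_far + added)

    extend([], 0)
    return best["cost"], best["assignment"]

# Hypothetical toy problem: three binary variables, three soft constraints.
costs = {
    (0, 1): {(0, 0): 5, (1, 1): 5},
    (1, 2): {(0, 0): 3, (1, 1): 3, (0, 1): 1, (1, 0): 2},
    (0, 2): {(0, 1): 2, (1, 0): 4},
}
domains = [[0, 1], [0, 1], [0, 1]]
best_cost, best_assignment = branch_and_bound(costs, domains)
```

In the distributed setting described in the abstract, the per-variable minima that `forward_bound` sums would be computed by the agents owning the unassigned variables and reported back asynchronously; here they are computed inline for simplicity.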
Preferences in Interactive Systems: Technical Challenges and Case Studies
Peintner, Bart (SRI International) | Viappiani, Paolo (University of Toronto) | Yorke-Smith, Neil (SRI International)
Interactive artificial intelligence systems employ preferences in both their reasoning and their interaction with the user. This survey considers preference handling in applications such as recommender systems, personal assistant agents, and personalized user interfaces. We survey the major questions and approaches, present illustrative examples, and give an outlook on potential benefits and challenges.
Reflective visualization and verbalization of unconscious preference
Maeno, Yoshiharu, Ohsawa, Yukio
A new method is presented that can help a person become aware of his or her unconscious preferences and convey them to others in the form of verbal explanation. The method combines the concepts of reflection, visualization, and verbalization. The method was tested in an experiment where the unconscious preferences of the subjects for various artworks were investigated. Two lessons were learned from the experiment. The first is that verbalizing weak preferences, as compared with strong preferences, through discussion over preference diagrams helps the subjects become aware of their unconscious preferences. The second is that it is effective to introduce an adjustable factor into visualization to adapt to the differences among the subjects and to foster their mutual understanding.
A Heuristic Search Approach to Planning with Continuous Resources in Stochastic Domains
Meuleau, N., Benazera, E., Brafman, R. I., Hansen, E. A., Mausam
We consider the problem of optimal planning in stochastic domains with resource constraints, where the resources are continuous and the choice of action at each step depends on resource availability. We introduce the HAO* algorithm, a generalization of the AO* algorithm that performs search in a hybrid state space that is modeled using both discrete and continuous state variables, where the continuous variables represent monotonic resources. Like other heuristic search algorithms, HAO* leverages knowledge of the start state and an admissible heuristic to focus computational effort on those parts of the state space that could be reached from the start state by following an optimal policy. We show that this approach is especially effective when resource constraints limit how much of the state space is reachable. Experimental results demonstrate its effectiveness in the domain that motivates our research: automated planning for planetary exploration rovers.
Feature Selection Methods for Improving Protein Structure Prediction with Rosetta
Blum, Ben, Baker, David, Jordan, Michael I., Bradley, Philip, Das, Rhiju, Kim, David E.
Rosetta is one of the leading algorithms for protein structure prediction today. It is a Monte Carlo energy minimization method requiring many random restarts to find structures with low energy. In this paper we present a resampling technique for structure prediction of small alpha/beta proteins using Rosetta. From an initial round of Rosetta sampling, we learn properties of the energy landscape that guide a subsequent round of sampling toward lower-energy structures. Rather than attempt to fit the full energy landscape, we use feature selection methods--both L1-regularized linear regression and decision trees--to identify structural features that give rise to low energy. We then enrich these structural features in the second sampling round. Results are presented across a benchmark set of nine small alpha/beta proteins demonstrating that our methods seldom impair, and frequently improve, Rosetta's performance.
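To make the feature selection step above concrete, here is a minimal sketch of L1-regularized linear regression solved by coordinate descent, where features whose weights are driven exactly to zero are discarded. The data are a hypothetical toy design, not Rosetta structural features or energies, and the function names and the penalty value `lam` are assumptions made for this example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic updates.
    Assumes every feature column has at least one non-zero entry."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (norm / n)
    return w

# Toy data: the target depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2 (y = 2.0 * x0 + 0.1 * x1).
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.1, 1.9, -1.9, -2.1]

w = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
```

With this penalty only feature 0 survives: the weak contribution of feature 1 falls below the threshold and is zeroed out, which is the behavior that makes the L1 penalty usable as a feature selector.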
Selecting Observations against Adversarial Objectives
Krause, Andreas, McMahan, Brendan, Guestrin, Carlos, Gupta, Anupam
In many applications, one has to actively select among a set of expensive observations before making an informed decision. Often, we want to select observations which perform well when evaluated with an objective function chosen by an adversary. Examples include minimizing the maximum posterior variance in Gaussian Process regression, robust experimental design, and sensor placement for outbreak detection. In this paper, we present the Submodular Saturation algorithm, a simple and efficient algorithm with strong theoretical approximation guarantees for the case where the possible objective functions exhibit submodularity, an intuitive diminishing returns property. Moreover, we prove that better approximation algorithms do not exist unless NP-complete problems admit efficient algorithms. We evaluate our algorithm on several real-world problems. For Gaussian Process regression, our algorithm compares favorably with state-of-the-art heuristics described in the geostatistics literature, while being simpler, faster and providing theoretical guarantees. For robust experimental design, our algorithm performs favorably compared to SDP-based algorithms.
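The diminishing returns property mentioned above can be seen in a small sketch using a set-coverage objective, a standard example of a submodular function. The code also shows the plain greedy selection rule for a single objective; it is not the Submodular Saturation algorithm, which handles the adversarial case with several objectives. The sensor names and covered regions are hypothetical.

```python
def coverage(sensors, chosen):
    """Number of distinct regions covered by the chosen sensors."""
    covered = set()
    for name in chosen:
        covered |= sensors[name]
    return len(covered)

def marginal_gain(sensors, chosen, candidate):
    """Extra coverage obtained by adding candidate to the chosen set."""
    return coverage(sensors, chosen | {candidate}) - coverage(sensors, chosen)

def greedy(sensors, k):
    """Pick up to k sensors, each time taking the largest marginal gain."""
    chosen = set()
    for _ in range(min(k, len(sensors))):
        remaining = sorted(sensors.keys() - chosen)  # sorted for determinism
        best = max(remaining, key=lambda s: marginal_gain(sensors, chosen, s))
        chosen.add(best)
    return chosen

# Hypothetical sensors and the regions each one observes.
sensors = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}

gain_alone = marginal_gain(sensors, set(), "b")           # b added first
gain_after_a = marginal_gain(sensors, {"a"}, "b")         # b added after a
gain_after_ac = marginal_gain(sensors, {"a", "c"}, "b")   # b added last
picked = greedy(sensors, 2)
```

The gain from sensor `b` shrinks as the chosen set grows, because more of its regions are already covered; this is the diminishing returns behavior that the approximation guarantees rely on.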