Goto

Collaborating Authors

 Domshlak, Carmel


On Interruptible Pure Exploration in Multi-Armed Bandits

AAAI Conferences

Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB favorably compete both with the currently popular strategies UCB1 and Epsilon-Greedy, as well as with the conservative uniform sampling.


Multi-Agent Coordination Off-Line: Structure and Complexity

AAAI Conferences

Coordination between processing entities is one of the most widely studied areas in multi-agent planning research. Recently, efforts have been made to understand the formal computational issues of this important area. In this paper, we make a step toward this direction, and analyze a certain class of coordination problems for dependent agents with independent goals acting in the same environment. We assume that a state-transition description of each agent is given, and that preconditioning an agent's transitions by the states of other agents is the only considered kind of inter-agent dependence. Off-line coordination between the agents is considered. We analyze some structural properties of these problems, and investigate the relationship between these properties and the complexity of coordination in this domain. We show that our general problem is provably intractable, but some significant subclasses are in NP and even polynomial.


Generic Preferences over Subsets of Structured Objects

arXiv.org Artificial Intelligence

Various tasks in decision making and decision support systems require selecting a preferred subset of a given set of items. Here we focus on problems where the individual items are described using a set of characterizing attributes, and a generic preference specification is required, that is, a specification that can work with an arbitrary set of items. For example, preferences over the content of an online newspaper should have this form: At each viewing, the newspaper contains a subset of the set of articles currently available. Our preference specification over this subset should be provided offline, but we should be able to use it to select a subset of any currently available set of articles, e.g., based on their tags. We present a general approach for lifting formalisms for specifying preferences over objects with multiple attributes into ones that specify preferences over subsets of such objects. We also show how we can compute an optimal subset given such a specification in a relatively efficient manner. We provide an empirical evaluation of the approach as well as some worst-case complexity results.


Data-Parallel Computing Meets STRIPS

AAAI Conferences

The increased demand for distributed computations on “big data” has led to solutions such as SCOPE, DryadLINQ, Pig, and Hive, which allow the user to specify queries in an SQL-like language, enriched with sets of user-defined operators. The lack of exact semantics for user-defined operators interferes with the query optimization process, thus putting the burden of suggesting, at least partial, query plans on the user. In an attempt to ease this burden, we propose a formal model that allows for data-parallel program synthesis (DPPS) in a semantically well-defined manner. We show that this model generalizes existing frameworks for data-parallel computation, while providing the flexibility of query plan generation that is currently absent from these frameworks. In particular, we show how existing, off-the-shelf, AI planning tools can be used for solving DPPS tasks.


Red-Black Relaxed Plan Heuristics

AAAI Conferences

Despite its success, the delete relaxation has significant pitfalls. Recent work has devised the red-black planning framework, where red variables take the relaxed semantics (accumulating their values), while black variables take the regular semantics. Provided the red variables are chosen so that red-black plan generation is tractable, one can generate such a plan for every search state, and take its length as the heuristic distance estimate. Previous results were not suitable for this purpose because they identified tractable fragments for red-black plan existence, as opposed to red-black plan generation. We identify a new fragment of red-black planning, that fixes this issue. We devise machinery to efficiently generate red-black plans, and to automatically select the red variables. Experiments show that the resulting heuristics can significantly improve over standard delete relaxation heuristics.


Symmetry Breaking: Satisficing Planning and Landmark Heuristics

AAAI Conferences

Searching for computational tools that can further push the boundary of satisficing planning, we show that reasoning about state-space symmetries can substantially improve even the most effective heuristic-search satisficing planners, with respect to all standard performance measures. The improvement comes from the state-space pruning, as well as from transparent cost-to-state updates and heuristic enhancement by information obtained during the search at different symmetric states.


Simple Regret Optimization in Online Planning for Markov Decision Processes

arXiv.org Artificial Intelligence

We consider online planning in Markov decision processes (MDPs). In online planning, the agent focuses on its current state only, deliberates about the set of possible policies from that state onwards and, when interrupted, uses the outcome of that exploratory deliberation to choose what action to perform next. The performance of algorithms for online planning is assessed in terms of simple regret, which is the agent's expected performance loss when the chosen action, rather than an optimal one, is followed. To date, state-of-the-art algorithms for online planning in general MDPs are either best effort, or guarantee only polynomial-rate reduction of simple regret over time. Here we introduce a new Monte-Carlo tree search algorithm, BRUE, that guarantees exponential-rate reduction of simple regret and error probability. This algorithm is based on a simple yet non-standard state-space sampling scheme, MCTS2e, in which different parts of each sample are dedicated to different exploratory objectives. Our empirical evaluation shows that BRUE not only provides superior performance guarantees, but is also very effective in practice and favorably compares to state-of-the-art. We then extend BRUE with a variant of "learning by forgetting." The resulting set of algorithms, BRUE(alpha), generalizes BRUE, improves the exponential factor in the upper bound on its reduction rate, and exhibits even more attractive empirical performance.


Introducing Variable Importance Tradeoffs into CP-Nets

arXiv.org Artificial Intelligence

The ability to make decisions and to assess potential courses of action is a corner-stone of many AI applications, and usually this requires explicit information about the decision-maker s preferences. IN many applications, preference elicitation IS a serious bottleneck.The USER either does NOT have the time, the knowledge, OR the expert support required TO specify complex multi - attribute utility functions. IN such cases, a method that IS based ON intuitive, yet expressive, preference statements IS required. IN this paper we suggest the USE OF TCP - nets, an enhancement OF CP - nets, AS a tool FOR representing, AND reasoning about qualitative preference statements.We present AND motivate this framework, define its semantics, AND show how it can be used TO perform constrained optimization.


Compact Value-Function Representations for Qualitative Preferences

arXiv.org Artificial Intelligence

We consider the challenge of preference elicitation in systems that help users discover the most desirable item(s) within a given database. Past work on preference elicitation focused on structured models that provide a factored representation of users' preferences. Such models require less information to construct and support efficient reasoning algorithms. This paper makes two substantial contributions to this area: (1) Strong representation theorems for factored value functions. (2) A methodology that utilizes our representation results to address the problem of optimal item selection.


Optimal Search with Inadmissible Heuristics

AAAI Conferences

Considering cost-optimal heuristic search, we introduce the notion of global admissibility of a heuristic, a property weaker than standard admissibility, yet sufficient for guaranteeing solution optimality within forward search. We describe a concrete approach for creating globally admissible heuristics for domain independent planning; it is based on exploiting information gradually gathered by the search via a new form of reasoning about what we call existential optimal-plan landmarks. We evaluate our approach on some state-of-the-art heuristic search tools for cost-optimal planning, and discuss the results of this evaluation.