Bandit-Based Search for Constraint Programming

Loth, Manuel (MSR–INRIA Joint Centre) | Sebag, Michèle (Université Paris-Sud) | Hamadi, Youssef (Microsoft Research, Cambridge) | Schulte, Christian (KTH Royal Institute of Technology) | Schoenauer, Marc

Jul-9-2013–AAAI Conferences

Constraint Programming (CP) solvers classically explore the solution space using tree-search based heuristics. Monte-Carlo Tree-Search (MCTS) is a tree-search method aimed at optimal sequential decision making under uncertainty. At the crossroads of CP and MCTS, this paper presents the Bandit Search for Constraint Programming (BASCOP) algorithm, adapting MCTS to the specifics of CP search trees. Formally, MCTS simultaneously estimates the average node reward, and uses it to bias the exploration towards the most promising regions of the tree, borrowing the multi-armed bandit (MAB) decision rule. The two contributions in BASCOP concern i) a specific reward function, estimating the relative failure depth conditionally to a (variable, value) assignment; ii) a new decision rule, hybridizing the MAB framework and the spirit of local neighborhood search. Specifically, BASCOP guides the CP search in the neighborhood of the previous best solution, by exploiting statistical estimates gathered across multiple restarts. BASCOP, using Gecode as the underlying constraint solver, shows significant improvements over the depth-first search baseline on some CP benchmark suites. For hard job-shop scheduling problems, BASCOP matches the results of state-of-the-art scheduling-specific CP approaches. These results demonstrate the potential of BASCOP as a generic yet robust search method for CP.

bandit-based search, constraint programming

AAAI Conferences

Jul-9-2013

Conferences PDF

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning
  - Constraint-Based Reasoning (1.00)
  - Search (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found