Goto

Collaborating Authors

 Search


Greedy or Not? Best Improving versus First Improving Stochastic Local Search for MAXSAT

AAAI Conferences

Stochastic local search (SLS) is the dominant paradigm forย incomplete SAT and MAXSAT solvers. Early studies onย small 3SAT instances found that the use of โ€œbest improvingโ€ย moves did not improve search compared to using an arbitraryย โ€œfirst improvingโ€ move. Yet SLS algorithms continue to useย best improving moves. We revisit this issue by studying veryย large random and industrial MAXSAT problems. Because locatingย best improving moves is more expensive than first improvingย moves, we designed an โ€œapproximate bestโ€ improvingย move algorithm and prove that it is as efficient as first improvingย move SLS. For industrial problems the first local optimaย found using best improving moves are statistically significantlyย better than local optima found using first improvingย moves. However, this advantage reverses as search continuesย and algorithms must explore equal moves on plateaus.ย This reversal appears to be associated with critical variablesย that are in many clauses and that also yield large improvingย moves.


Partial Domain Search Tree For Constraint-Satisfaction Problems

AAAI Conferences

CSP solvers usually search a partial assignment search tree.We present a new formalization for CSP solvers, which spansa conceptually different search tree, where each node representssubsets of the original domains for the variables. We experimentwith a simple backtracking algorithm for this searchtree and show that it outperforms a simple backtracking algorithmon the traditional search tree in many cases.


Automated Design of Search with Composability

AAAI Conferences

We propose a new perspective on the automated design of combinatorial search algorithms through an approach that operates at a much higher semantic level than previous algorithm configurators do. Instead of blindly tuning numerical or categorical parameters based on black-box optimization or resorting to a handful of predefined strategies, we propose to automatically search over compositions of search strategies using a light-weight language, while exploiting the semantic knowledge of the modeling language itself to guide the configuration process. Although somewhat reminiscent of the old AI vision that machines will be able to program themselves to solve novel tasks, we believe that the idea restricted to this simple but powerful search language has a chance of success in practice.


Throwing Darts: Random Sampling Helps Tree Search when the Number of Short Certificates is Moderate

AAAI Conferences

One typically proves infeasibility in satisfiability/constraint satisfaction (or optimality in integer programming) by constructing a tree certificate. However, deciding how to branch in the search tree is hard, and impacts search time drastically. We explore the power of a simple paradigm, that of throwing random darts into the assignment space and then using information gathered by that dart to guide what to do next. This method seems to work well when the number of short certificates of infeasibility is moderate, suggesting that the overhead of throwing darts is more than paid for by the information gained by these darts.


Heuristic Search for Large Problems with Real Costs

AAAI Conferences

Heuristic search is a fundamental technique for solving problems in artificial intelligence. However, many heuristic search algorithms, such as A* are limited by the amount of main memory available. External memory search overcomes the memory limitation of A* by taking advantage of cheap secondary storage, such as disk. Previous work in this area assumes that edge costs fall within a narrow range of integer values and relies on uniformed search order. The goal of this dissertation research is to develop novel techniques that enable heuristic search algorithms to solve problems with real values using a best-first search order while exploiting external memory and multiple processors. This work will be organized into four components. The first component will discuss external memory search and present a novel technique for incorporating real-valued edge costs. The second component will present a novel algorithm for solving problems with large branching factors with application to the challenging problem of Multiple Sequence Alignment (MSA). The third component will cover bounded suboptimal external search for practical MSA applications. The final component of this research will be the development of a novel distributed search framework; allowing parallel and external memory heuristic search algorithms to run cooperatively on a commodity computing cluster. Together these four components will enable heuristic search to scale to large problems in practical settings while exploiting modern hardware.


Robust Network Design For Multispecies Conservation

AAAI Conferences

Our work is motivated by an important network design application in computational sustainability concerning wildlife conservation. In the face of human development and climate change, it is important that conservation plans for protecting landscape connectivity exhibit certain level of robustness. While previous work has focused on conservation strategies that result in a connected network of habitat reserves, the robustness of the proposed solutions has not been taken into account. In order to address this important aspect, we formalize the problem as a node-weighted bi-criteria network design problem with connectivity requirements on the number of disjoint paths between pairs of nodes. While in most previous work on survivable network design the objective is to minimize the cost of the selected network, our goal is to optimize the quality of the selected paths within a specified budget, while meeting the connectivity requirements. We characterize the complexity of the problem under different restrictions. We provide a mixed-integer programming encoding that allows for finding solutions with optimality guarantees, as well as a hybrid local search method with better scaling behavior but no guarantees. We evaluate the typical-case performance of our approaches using a synthetic benchmark, and apply them to a large-scale real-world network design problem concerning the conservation of wolverine and lynx populations in the U.S. Rocky Mountains (Montana).


Mixed Heuristic Local Search for Protein Structure Prediction

AAAI Conferences

Protein structure prediction is an unsolved problem in computational biology. One great difficulty is due to the unknown factors in the actual energy function. Moreover, the energy models available are often not very informative particularly when spatially similar structures are compared during search. We introduce several novel heuristics to augment the energy model and present a new local search algorithm that exploits these heuristics in a mixed fashion. Although the heuristics individually are weaker in performance than the energy function, their combination interestingly produces stronger results. For standard benchmark proteins on the face centered cubic lattice and a realistic 20x20 energy model, we obtain structures with significantly lower energy than those obtained by the state-of-the-art algorithms. We also report results for these proteins using the same energy model on the cubic lattice.


Reciprocal Hash Tables for Nearest Neighbor Search

AAAI Conferences

Recent years have witnessed the success of hashingtechniques in approximate nearest neighbor search. Inpractice, multiple hash tables are usually employed toretrieve more desired results from all hit buckets ofeach table. However, there are rare works studying theunified approach to constructing multiple informativehash tables except the widely used random way. In thispaper, we regard the table construction as a selectionproblem over a set of candidate hash functions. Withthe graph representation of the function set, we proposean efficient solution that sequentially applies normal-ized dominant set to finding the most informative andindependent hash functions for each table. To furtherreduce the redundancy between tables, we explore thereciprocal hash tables in a boosting manner, where thehash function graph is updated with high weights em-phasized on the misclassified neighbor pairs of previoushash tables. The construction method is general andcompatible with different types of hashing algorithmsusing different feature spaces and/or parameter settings.Extensive experiments on two large-scale benchmarksdemonstrate that the proposed method outperforms bothnaive construction method and state-of-the-art hashingalgorithms, with up to 65.93% accuracy gains.


Walking on Minimax Paths for k-NN Search

AAAI Conferences

Link-based dissimilarity measures, such as shortest path or Euclidean commute time distance, base their distance on paths between nodes of a weighted graph. These measures are known to be better suited to data manifold with nonconvex-shaped clusters, compared to Euclidean distance, so that k -nearest neighbor (NN) search is improved in such metric spaces. In this paper we present a new link-based dissimilarity measure based on minimax paths between nodes. Two main benefits of minimax path-based dissimilarity measure are: (1) only a subset of paths is considered to make it scalable, while Euclidean commute time distance considers all possible paths; (2) it better captures nonconvex-shaped cluster structure, compared to shortest path distance. We define the total cost assigned to a path between nodes as L p norm of intermediate costs of edges involving the path, showing that minimax path emerges from our L p norm over paths framework. We also define minimax distance as the intermediate cost of the longest edge on the minimax path, then present a greedy algorithm to compute k smallest minimax distances between a query and N data points in O(log N + k log k) time. Numerical experiments demonstrate that our minimax k-NN algorithm reduce the search time by several orders of magnitude, compared to existing methods, while the quality of k -NN search is significantly improved over Euclidean distance.


Resolution and Parallelizability: Barriers to the Efficient Parallelization of SAT Solvers

AAAI Conferences

Recent attempts to create versions of Satisfiability (SAT) solversthat exploit parallel hardware and information sharing have met withlimited success. In fact,the most successful parallel solvers in recent competitions were basedon portfolio approaches with little to no exchange of informationbetween processors. This experience contradicts the apparentparallelizability of exploring a combinatorial search space. Wepresent evidence that this discrepancy can be explained by studyingSAT solvers through a proof complexity lens, as resolution refutationengines. Starting with theobservation that a recently studied measure of resolution proofs,namely depth, provides a (weak) upper bound to the best possiblespeedup achievable by such solvers, we empirically show the existenceof bottlenecks to parallelizability that resolution proofs typicallygenerated by SAT solvers exhibit. Further, we propose a new measureof parallelizability based on the best-case makespan of an offlineresource constrained scheduling problem. This measureexplicitly accounts for a bounded number of parallel processors andappears to empirically correlate with parallel speedups observed inpractice. Our findings suggest that efficient parallelization of SATsolvers is not simply a matter of designing the right clause sharingheuristics; even in the best case, it can be --- and indeed is ---hindered by the structure of the resolution proofs current SAT solverstypically produce.