Search
Toward an Understanding of Local Search Cost in Job-Shop Scheduling
Watson, Jean-Paul (Sandia National Laboratories) | Beck, Chris (University of Toronto) | Howe, Adele (Colorado State University) | Whitley, Darrell (Colorado State University)
Local search algorithms are among the most effective approaches for solving the JSP, yet we have little understanding of why these algorithm work so well, and under what conditions. We develop descriptive cost models of local search for the job-shop scheduling problem (JSP), borrowing from the models developed for MAX-SAT. We show that several factors known to influence the difficulty of local search in MAX-SAT directly carry over to the general JSP, including the number of optimal solutions, backbone size, the distance between initial solutions and the nearest optimal solution, and an analog of backbone robustness. However, these same factors only weakly influence local search cost in JSPs with workflow, which possess structured constraints. While the factors present in the MAX-SAT cost models provide an accurate description of local search cost in the general JSP, our results for workflow JSPs raise concerns regarding the applicability of cost models derived using random problems to those exhibiting specific structure.
Learning directed acyclic graphs via bootstrap aggregating
Probabilistic graphical models are graphical representations of probability distributions. Graphical models have applications in many fields including biology, social sciences, linguistic, neuroscience. In this paper, we propose directed acyclic graphs (DAGs) learning via bootstrap aggregating. The proposed procedure is named as DAGBag. Specifically, an ensemble of DAGs is first learned based on bootstrap resamples of the data and then an aggregated DAG is derived by minimizing the overall distance to the entire ensemble. A family of metrics based on the structural hamming distance is defined for the space of DAGs (of a given node set) and is used for aggregation. Under the high-dimensional-low-sample size setting, the graph learned on one data set often has excessive number of false positive edges due to over-fitting of the noise. Aggregation overcomes over-fitting through variance reduction and thus greatly reduces false positives. We also develop an efficient implementation of the hill climbing search algorithm of DAG learning which makes the proposed method computationally competitive for the high-dimensional regime. The DAGBag procedure is implemented in the R package dagbag.
Reconnection with the Ideal Tree: A New Approach to Real-Time Search
Rivera, N., Illanes, L., Baier, J. A., Hernandez, C.
Many applications, ranging from video games to dynamic robotics, require solving single-agent, deterministic search problems in partially known environments under very tight time constraints. Real-Time Heuristic Search (RTHS) algorithms are specifically designed for those applications. As a subroutine, most of them invoke a standard, but bounded, search algorithm that searches for the goal. In this paper we present FRIT, a simple approach for single-agent deterministic search problems under tight constraints and partially known environments that unlike traditional RTHS does not search for the goal but rather searches for a path that connects the current state with a so-called ideal tree T . When the agent observes that an arc in the tree cannot be traversed in the actual environment, it removes such an arc from T and then carries out a reconnection search whose objective is to find a path between the current state and any node in T . The reconnection search is done using an algorithm that is passed as a parameter to FRIT. If such a parameter is an RTHS algorithm, then the resulting algorithm can be an RTHS algorithm. We show, in addition, that FRIT may be fed with a (bounded) complete blind-search algorithm. We evaluate our approach over grid pathfinding benchmarks including game maps and mazes. Our results show that FRIT, used with RTAA*, a standard RTHS algorithm, outperforms RTAA* significantly; by one order of magnitude under tight time constraints. In addition, FRIT(daRTAA*) substantially outperforms daRTAA*, a state-of-the-art RTHS algorithm, usually obtaining solutions 50% cheaper on average when performing the same search effort. Finally, FRIT(BFS), i.e., FRIT using breadth-first-search, obtains best-quality solutions when time is limited compared to Adaptive A* and Repeated A*. Finally we show that Bug2, a pathfinding-specific navigation algorithm, outperforms FRIT(BFS) when planning time is extremely limited, but when given more time, the situation reverses.
Enhanced Partial Expansion A*
Goldenberg, M., Felner, A., Stern, R., Sharon, G., Sturtevant, N., Holte, R. C., Schaeffer, J.
When solving instances of problem domains that feature a large branching factor, A* may generate a large number of nodes whose cost is greater than the cost of the optimal solution. We designate such nodes as surplus. Generating surplus nodes and adding them to the OPEN list may dominate both time and memory of the search. A recently introduced variant of A* called Partial Expansion A* (PEA*) deals with the memory aspect of this problem. When expanding a node n, PEA* generates all of its children and puts into OPEN only the children with f = f (n). n is re-inserted in the OPEN list with the f -cost of the best discarded child. This guarantees that surplus nodes are not inserted into OPEN. In this paper, we present a novel variant of A* called Enhanced Partial Expansion A* (EPEA*) that advances the idea of PEA* to address the time aspect. Given a priori domain- and heuristic- specific knowledge, EPEA* generates only the nodes with f = f(n). Although EPEA* is not always applicable or practical, we study several variants of EPEA*, which make it applicable to a large number of domains and heuristics. In particular, the ideas of EPEA* are applicable to IDA* and to the domains where pattern databases are traditionally used. Experimental studies show significant improvements in run-time and memory performance for several standard benchmark applications. We provide several theoretical studies to facilitate an understanding of the new algorithm.
BliStr: The Blind Strategymaker
BliStr is a system that automatically develops strategies for E prover on a large set of problems. The main idea is to interleave (i) iterated low-timelimit local search for new strategies on small sets of similar easy problems with (ii) higher-timelimit evaluation of the new strategies on all problems. The accummulated results of the global higher-timelimit runs are used to define and evolve the notion of "similar easy problems", and to control the selection of the next strategy to be improved. The technique was used to significantly strengthen the set of E strategies used by the MaLARea, PS-E, E-MaLeS, and E systems in the CASC@Turing 2012 competition, particularly in the Mizar division. Similar improvement was obtained on the problems created from the Flyspeck corpus.
Solving the Minimum Common String Partition Problem with the Help of Ants
Ferdous, S. M., Rahman, M. Sohel
In this paper, we consider the problem of finding a minimum common partition of two strings. The problem has its application in genome comparison. As it is an NPhard, discrete combinatorial optimization problem, we employ a metaheuristic technique, namely, MAX-MIN ant system to solve this problem. To achieve better efficiency we first map the problem instance into a special kind of graph. Subsequently, we employ a MAX-MIN ant system to achieve high quality solutions for the problem. Experimental results show the superiority of our algorithm in comparison with the state of art algorithm in the literature. The improvement achieved is also justified by standard statistical test. Keywords: Ant Colony Optimization, Stringology, Genome sequencing, Combinatorial Optimization, Swarm Intelligence, String partitioning 1. Introduction String comparison is one of the important problems in Computer Science with diverse applications in different areas including Genome Sequencing, text processing and compressions. In this paper, we address the problem of finding a minimum common partition (MCSP) of two strings. MCSP is closely related to genome arrangement which is an important topic in computational biology.
BTT-Go: An Agent for Go that Uses a Transposition Table to Reduce the Simulations and the Supervision in the Monte-Carlo Tree Search
Junior, Eldane Vieira (Federal University of Uberlรขndia) | Julia, Rita Maria (Federal University of Uberlรขndia)
This paper presents BTT-Go: an agent for Go whose ar- chitecture is based on the well-known agent Fuego, that is, its search process for the best move is based on sim- ulations of games performed by means of Monte- Carlo Tree Search (MCTS). In Fuego, these simulations are guided by supervised heuristics called prior knowledge and play-out policy. In this context, the goal behind the BTT-Go proposal is to reduce the supervised character of Fuego, granting it more autonomy. To cope with this task, the BTT-Go counts on a Transposition Table (TT) whose role is not to waste the history of the nodes that have already been explored throughout the game. By this way, the agent proposed here reduces the super- vised character of Fuego by replacing, whenever pos- sible, the prior knowledge and the play-out policy with the information retrieved from the TT. Several evalua- tive tournaments involving BTT-Go and Fuego confirm that the former obtains satisfactory results in its purpose of attenuating the supervision in Fuego without losing its competitiveness, even in 19x19 game-boards.
Hybrid Metaheuristics for the Clustered Vehicle Routing Problem
Vidal, Thibaut, Battarra, Maria, Subramanian, Anand, Erdoวงan, Gรผneล
The Clustered Vehicle Routing Problem (CluVRP) is a variant of the Capacitated Vehicle Routing Problem in which customers are grouped into clusters. Each cluster has to be visited once, and a vehicle entering a cluster cannot leave it until all customers have been visited. This article presents two alternative hybrid metaheuristic algorithms for the CluVRP. The first algorithm is based on an Iterated Local Search algorithm, in which only feasible solutions are explored and problem-specific local search moves are utilized. The second algorithm is a Hybrid Genetic Search, for which the shortest Hamiltonian path between each pair of vertices within each cluster should be precomputed. Using this information, a sequence of clusters can be used as a solution representation and large neighborhoods can be efficiently explored by means of bi-directional dynamic programming, sequence concatenations, by using appropriate data structures. Extensive computational experiments are performed on benchmark instances from the literature, as well as new large scale ones. Recommendations on promising algorithm choices are provided relatively to average cluster size.
Fast Exact Search in Hamming Space with Multi-Index Hashing
Norouzi, Mohammad, Punjani, Ali, Fleet, David J.
There is growing interest in representing image data and feature descriptors using compact binary codes for fast near neighbor search. Although binary codes are motivated by their use as direct indices (addresses) into a hash table, codes longer than 32 bits are not being used as such, as it was thought to be ineffective. We introduce a rigorous way to build multiple hash tables on binary code substrings that enables exact k-nearest neighbor search in Hamming space. The approach is storage efficient and straightforward to implement. Theoretical analysis shows that the algorithm exhibits sub-linear run-time behavior for uniformly distributed codes. Empirical results show dramatic speedups over a linear scan baseline for datasets of up to one billion codes of 64, 128, or 256 bits.
Budgeted Influence Maximization for Multiple Products
Du, Nan, Liang, Yingyu, Balcan, Maria Florina, Song, Le
The typical algorithmic problem in viral marketing aims to identify a set of influential users in a social network, who, when convinced to adopt a product, shall influence other users in the network and trigger a large cascade of adoptions. However, the host (the owner of an online social platform) often faces more constraints than a single product, endless user attentions, unlimited budget and unbounded time; in reality, multiple products need to be advertised, each user can tolerate only a small number of recommendations, influencing user has a cost and advertisers have only limited budgets, and the adoptions need to be maximized within a short time window. Given theses myriads of user, monetary, and timing constraints, it is extremely challenging for the host to design principled and efficient viral market algorithms with provable guarantees. In this paper, we provide a novel solution by formulating the problem as a submodular maximization in a continuous-time diffusion model under an intersection of a matroid and multiple knapsack constraints. We also propose an adaptive threshold greedy algorithm which can be faster than the traditional greedy algorithm with lazy evaluation, and scalable to networks with million of nodes. Furthermore, our mathematical formulation allows us to prove that the algorithm can achieve an approximation factor of $k_a/(2+2 k)$ when $k_a$ out of the $k$ knapsack constraints are active, which also improves over previous guarantees from combinatorial optimization literature. In the case when influencing each user has uniform cost, the approximation becomes even better to a factor of $1/3$. Extensive synthetic and real world experiments demonstrate that our budgeted influence maximization algorithm achieves the-state-of-the-art in terms of both effectiveness and scalability, often beating the next best by significant margins.