Search
Policy-Based Self-Competition for Planning Problems
Pirnay, Jonathan, Göttl, Quirin, Burger, Jakob, Grimm, Dominik Gerhard
AlphaZero-type algorithms may stop improving on single-player tasks in case the value network guiding the tree search is unable to approximate the outcome of an episode sufficiently well. One technique to address this problem is transforming the single-player task through self-competition. The main idea is to compute a scalar baseline from the agent's historical performances and to reshape an episode's reward into a binary output, indicating whether the baseline has been exceeded or not. However, this baseline only carries limited information for the agent about strategies how to improve. We leverage the idea of self-competition and directly incorporate a historical policy into the planning process instead of its scalar performance. Based on the recently introduced Gumbel AlphaZero (GAZ), we propose our algorithm GAZ 'Play-to-Plan' (GAZ PTP), in which the agent learns to find strong trajectories by planning against possible strategies of its past self. We show the effectiveness of our approach in two well-known combinatorial optimization problems, the Traveling Salesman Problem and the Job-Shop Scheduling Problem. With only half of the simulation budget for search, GAZ PTP consistently outperforms all selected single-player variants of GAZ.
Synthesising Recursive Functions for First-Order Model Counting: Challenges, Progress, and Conjectures
Dilkas, Paulius, Belle, Vaishak
First-order model counting (FOMC) is a computational problem that asks to count the models of a sentence in finite-domain first-order logic. In this paper, we argue that the capabilities of FOMC algorithms to date are limited by their inability to express many types of recursive computations. To enable such computations, we relax the restrictions that typically accompany domain recursion and generalise the circuits used to express a solution to an FOMC problem to directed graphs that may contain cycles. To this end, we adapt the most well-established (weighted) FOMC algorithm ForcLift to work with such graphs and introduce new compilation rules that can create cycle-inducing edges that encode recursive function calls. These improvements allow the algorithm to find efficient solutions to counting problems that were previously beyond its reach, including those that cannot be solved efficiently by any other exact FOMC algorithm. We end with a few conjectures on what classes of instances could be domain-liftable as a result.
Meta-SAGE: Scale Meta-Learning Scheduled Adaptation with Guided Exploration for Mitigating Scale Shift on Combinatorial Optimization
Son, Jiwoo, Kim, Minsu, Kim, Hyeonah, Park, Jinkyoo
This paper proposes Meta-SAGE, a novel approach for improving the scalability of deep reinforcement learning models for combinatorial optimization (CO) tasks. Our method adapts pre-trained models to larger-scale problems in test time by suggesting two components: a scale meta-learner (SML) and scheduled adaptation with guided exploration (SAGE). First, SML transforms the context embedding for subsequent adaptation of SAGE based on scale information. Then, SAGE adjusts the model parameters dedicated to the context embedding for a specific instance. SAGE introduces locality bias, which encourages selecting nearby locations to determine the next location. The locality bias gradually decays as the model is adapted to the target instance. Results show that Meta-SAGE outperforms previous adaptation methods and significantly improves scalability in representative CO tasks. Our source code is available at https://github.com/kaist-silab/meta-sage
Simulation-Assisted Optimization for Large-Scale Evacuation Planning with Congestion-Dependent Delays
Islam, Kazi Ashik, Chen, Da Qi, Marathe, Madhav, Mortveit, Henning, Swarup, Samarth, Vullikanti, Anil
Evacuation planning is a crucial part of disaster management. However, joint optimization of its two essential components, routing and scheduling, with objectives such as minimizing average evacuation time or evacuation completion time, is a computationally hard problem. To approach it, we present MIP-LNS, a scalable optimization method that utilizes heuristic search with mathematical optimization and can optimize a variety of objective functions. We also present the method MIP-LNS-SIM, where we combine agent-based simulation with MIP-LNS to estimate delays due to congestion, as well as, find optimized plans considering such delays. We use Harris County in Houston, Texas, as our study area. We show that, within a given time limit, MIP-LNS finds better solutions than existing methods in terms of three different metrics. However, when congestion dependent delay is considered, MIP-LNS-SIM outperforms MIP-LNS in multiple performance metrics. In addition, MIP-LNS-SIM has a significantly lower percent error in estimated evacuation completion time compared to MIP-LNS.
Machine Learning Testing in an ADAS Case Study Using Simulation-Integrated Bio-Inspired Search-Based Testing
Moghadam, Mahshid Helali, Borg, Markus, Saadatmand, Mehrdad, Mousavirad, Seyed Jalaleddin, Bohlin, Markus, Lisper, Björn
This paper presents an extended version of Deeper, a search-based simulation-integrated test solution that generates failure-revealing test scenarios for testing a deep neural network-based lane-keeping system. In the newly proposed version, we utilize a new set of bio-inspired search algorithms, genetic algorithm (GA), $({\mu}+{\lambda})$ and $({\mu},{\lambda})$ evolution strategies (ES), and particle swarm optimization (PSO), that leverage a quality population seed and domain-specific cross-over and mutation operations tailored for the presentation model used for modeling the test scenarios. In order to demonstrate the capabilities of the new test generators within Deeper, we carry out an empirical evaluation and comparison with regard to the results of five participating tools in the cyber-physical systems testing competition at SBST 2021. Our evaluation shows the newly proposed test generators in Deeper not only represent a considerable improvement on the previous version but also prove to be effective and efficient in provoking a considerable number of diverse failure-revealing test scenarios for testing an ML-driven lane-keeping system. They can trigger several failures while promoting test scenario diversity, under a limited test time budget, high target failure severity, and strict speed limit constraints.
Safe Peeling for L0-Regularized Least-Squares with supplementary material
Guyard, Théo, Monnoyer, Gilles, Elvira, Clément, Herzet, Cédric
Abstract--We introduce a new methodology dubbed "safe The rest of the paper is organized as follows. Our peeling This paper focuses on the resolution of the so-called "l IV and its performance is regularized least-squares" problem given by illustrated in Sec. V. Proofs of the results presented in the p Unfortunately, this problem also turns out to be has to be understood component-wise, e.g., x [l,u] means NP-hard [4, Th. 1]. Moreover, η() denotes the indicator function flurry of contributions proposing tractable procedures able to which equals to 0 if the condition in argument is fulfilled and recover approximate solutions of (1-P). This observation, combined with some recent III. Due to space limitation, we only review the elements of A standard approach is to use a Branch-and-Bound (BnB) interest to introduce the proposed peeling method. We refer the algorithm that solves (1-P), see [6-11]. In this paper, we propose a new strategy, dubbed "safe peeling", to accelerate the exact resolution of (1-P). In a A. Pruning nutshell, our contribution is a computationally simple test applied at each node of the BnB decision tree to identify some The crux of BnB methods consists in identifying and discarding intervals of R In this section, we introduce our proposed peeling procedure. This assumption will One ubiquitous approach in the literature [7, 9, 13] to find be relaxed later on in Sec. A proof of this result is available in App. A. A consequence On the one hand, item i) implies that the pruning test (4) of preserving the pruning decision is that taking the new involves the following quantity (rather than p This additional constraint usually takes the form " M x This impairment pertains to a large class of mixed-integer problems and with M > 0 and is known as "Big-M" constraint, see [7, Sec.
AutoPEFT: Automatic Configuration Search for Parameter-Efficient Fine-Tuning
Zhou, Han, Wan, Xingchen, Vulić, Ivan, Korhonen, Anna
Large pretrained language models are widely used in downstream NLP tasks via task-specific fine-tuning, but such procedures can be costly. Recently, Parameter-Efficient Fine-Tuning (PEFT) methods have achieved strong task performance while updating a much smaller number of parameters compared to full model fine-tuning (FFT). However, it is non-trivial to make informed design choices on the PEFT configurations, such as their architecture, the number of tunable parameters, and even the layers in which the PEFT modules are inserted. Consequently, it is highly likely that the current, manually designed configurations are suboptimal in terms of their performance-efficiency trade-off. Inspired by advances in neural architecture search, we propose AutoPEFT for automatic PEFT configuration selection: we first design an expressive configuration search space with multiple representative PEFT modules as building blocks. Using multi-objective Bayesian optimisation in a low-cost setup, we then discover a Pareto-optimal set of configurations with strong performance-cost trade-offs across different numbers of parameters that are also highly transferable across different tasks. Empirically, on GLUE and SuperGLUE tasks, we show that AutoPEFT-discovered configurations significantly outperform existing PEFT methods and are on par or better than FFT, without incurring substantial training efficiency costs.
Learning Search-Space Specific Heuristics Using Neural Networks
Liu, Yu, Kuroiwa, Ryo, Fukunaga, Alex
We propose and evaluate a system which learns a neuralnetwork heuristic function for forward search-based, satisficing classical planning. Our system learns distance-to-goal estimators from scratch, given a single PDDL training instance. Training data is generated by backward regression search or by backward search from given or guessed goal states. In domains such as the 24-puzzle where all instances share the same search space, such heuristics can also be reused across all instances in the domain. We show that this relatively simple system can perform surprisingly well, sometimes competitive with well-known domain-independent heuristics.
Learning-Based Heuristic for Combinatorial Optimization of the Minimum Dominating Set Problem using Graph Convolutional Networks
Kothapalli, Abihith, Shabbir, Mudassir, Koutsoukos, Xenofon
These optimization problems offer a means to model highly intricate discrete decision problems across diverse domains where pairwise interactions play a crucial role, such as social network analysis [1], wireless communications [2], operations research [3], scheduling [4], and transportation [5]. A considerable portion of these problems belongs to the broader class of NP-hard problems, where it is challenging to find exact solutions, as doing so often necessitates a near-complete enumeration of the entire search space. Consequently, computation of exact solutions is practically infeasible, and approximation or heuristic algorithms are generally favored for practical applications. Although these algorithms exhibit significantly faster runtime and possess sub-exponential theoretical complexities, they often yield suboptimal solutions. Therefore, a key area of research revolves around the development of approximation or heuristic algorithms that can provide solutions that are as close to optimal as possible. The minimum dominating set (MDS) problem is an important network-based optimization problem that involves finding the smallest dominating set of a given graph. A dominating set of a graph is a subset of the vertices in the graph such that every vertex is either in the dominating set or adjacent to a vertex in the dominating set. The MDS problem aims to find the dominating set of minimum cardinality. Dominating sets have a wide range of applications in various fields, including social networks [6-8], cybersecurity [9], biological networks [10], bioinformatics [11], multi-document summarization [12], and wireless sensor networks [13]
Rigorous Runtime Analysis of MOEA/D for Solving Multi-Objective Minimum Weight Base Problems
Do, Anh Viet, Neumann, Aneta, Neumann, Frank, Sutton, Andrew M.
We study the multi-objective minimum weight base problem, an abstraction of classical NP-hard combinatorial problems such as the multi-objective minimum spanning tree problem. We prove some important properties of the convex hull of the non-dominated front, such as its approximation quality and an upper bound on the number of extreme points. Using these properties, we give the first run-time analysis of the MOEA/D algorithm for this problem, an evolutionary algorithm that effectively optimizes by decomposing the objectives into single-objective components. We show that the MOEA/D, given an appropriate decomposition setting, finds all extreme points within expected fixed-parameter polynomial time in the oracle model, the parameter being the number of objectives. Experiments are conducted on random bi-objective minimum spanning tree instances, and the results agree with our theoretical findings. Furthermore, compared with a previously studied evolutionary algorithm for the problem GSEMO, MOEA/D finds all extreme points much faster across all instances.