Optimization
A distributed, plug-n-play algorithm for multi-robot applications with a priori non-computable objective functions
Kapoutsis, Athanasios Ch., Chatzichristofis, Savvas A., Kosmatopoulos, Elias B.
This paper presents a distributed algorithm applicable to a wide range of practical multi-robot applications. In such multi-robot applications, the user-defined objectives of the mission can be cast as a general optimization problem, without explicit guidelines of the subtasks per different robot. Owing to the unknown environment, unknown robot dynamics, sensor nonlinearities, etc., the analytic form of the optimization cost function is not available a priori. Therefore, standard gradient-descent-like algorithms are not applicable to these problems. To tackle this, we introduce a new algorithm that carefully designs each robot's subcost function, the optimization of which can accomplish the overall team objective. Upon this transformation, we propose a distributed methodology based on the cognitive-based adaptive optimization (CAO) algorithm, that is able to approximate the evolution of each robot's cost function and to adequately optimize its decision variables (robot actions). The latter can be achieved by online learning only the problem-specific characteristics that affect the accomplishment of mission objectives. The overall, low-complexity algorithm can straightforwardly incorporate any kind of operational constraint, is fault-tolerant, and can appropriately tackle time-varying cost functions. A cornerstone of this approach is that it shares the same convergence characteristics as those of block coordinate descent algorithms. The proposed algorithm is evaluated in three heterogeneous simulation set-ups under multiple scenarios, against both general-purpose and problem-specific algorithms. Source code is available at https://github.com/athakapo/A-distributed-plug-n-play-algorithm-for-multi-robot-applications.
An Efficient Combinatorial Optimization Model Using Learning-to-Rank Distillation
Woo, Honguk, Lee, Hyunsung, Cho, Sangwoo
Recently, deep reinforcement learning (RL) has proven its feasibility in solving combinatorial optimization problems (COPs). The learning-to-rank techniques have been studied in the field of information retrieval. While several COPs can be formulated as the prioritization of input items, as is common in the information retrieval, it has not been fully explored how the learning-to-rank techniques can be incorporated into deep RL for COPs. In this paper, we present the learning-to-rank distillation-based COP framework, where a high-performance ranking policy obtained by RL for a COP can be distilled into a non-iterative, simple model, thereby achieving a low-latency COP solver. Specifically, we employ the approximated ranking distillation to render a score-based ranking model learnable via gradient descent. Furthermore, we use the efficient sequence sampling to improve the inference performance with a limited delay. With the framework, we demonstrate that a distilled model not only achieves comparable performance to its respective, high-performance RL, but also provides several times faster inferences. We evaluate the framework with several COPs such as priority-based task scheduling and multidimensional knapsack, demonstrating the benefits of the framework in terms of inference latency and performance.
Practical Fixed-Parameter Algorithms for Defending Active Directory Style Attack Graphs
Guo, Mingyu, Li, Jialiang, Neumann, Aneta, Neumann, Frank, Nguyen, Hung
Active Directory is the default security management system for Windows domain networks. We study the shortest path edge interdiction problem for defending Active Directory style attack graphs. The problem is formulated as a Stackelberg game between one defender and one attacker. The attack graph contains one destination node and multiple entry nodes. The attacker's entry node is chosen by nature. The defender chooses to block a set of edges limited by his budget. The attacker then picks the shortest unblocked attack path. The defender aims to maximize the expected shortest path length for the attacker, where the expectation is taken over entry nodes. We observe that practical Active Directory attack graphs have small maximum attack path lengths and are structurally close to trees. We first show that even if the maximum attack path length is a constant, the problem is still $W[1]$-hard with respect to the defender's budget. Having a small maximum attack path length and a small budget is not enough to design fixed-parameter algorithms. If we further assume that the number of entry nodes is small, then we derive a fixed-parameter tractable algorithm. We then propose two other fixed-parameter algorithms by exploiting the tree-like features. One is based on tree decomposition and requires a small tree width. The other assumes a small number of splitting nodes (nodes with multiple out-going edges). Finally, the last algorithm is converted into a graph convolutional neural network based heuristic, which scales to larger graphs with more splitting nodes.
Manager, use this to plan the smart working days for your team!
Originally published on Towards AI the World's Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses. During covid-19 several teams were constrained to work from home entirely.
Integrating Quantum Processor Device and Control Optimization in a Gradient-based Framework
Ni, Xiaotong, Zhao, Hui-Hai, Wang, Lei, Wu, Feng, Chen, Jianxin
In a quantum processor, the device design and external controls together contribute to the quality of the target quantum operations. As we continuously seek better alternative qubit platforms, we explore the increasingly large device and control design space. Thus, optimization becomes more and more challenging. In this work, we demonstrate that the figure of merit reflecting a design goal can be made differentiable with respect to the device and control parameters. In addition, we can compute the gradient of the design objective efficiently in a similar manner to the back-propagation algorithm and then utilize the gradient to optimize the device and the control parameters jointly and efficiently. This extends the scope of the quantum optimal control to superconducting device design. We also demonstrate the viability of gradient-based joint optimization over the device and control parameters through a few examples.
Reinforcement Learning based Sequential Batch-sampling for Bayesian Optimal Experimental Design
Ashenafi, Yonatan, Pandita, Piyush, Ghosh, Sayan
Engineering problems that are modeled using sophisticated mathematical methods or are characterized by expensive-to-conduct tests or experiments, are encumbered with limited budget or finite computational resources. Moreover, practical scenarios in the industry, impose restrictions, based on logistics and preference, on the manner in which the experiments can be conducted. For example, material supply may enable only a handful of experiments in a single-shot or in the case of computational models one may face significant wait-time based on shared computational resources. In such scenarios, one usually resorts to performing experiments in a manner that allows for maximizing one's state-of-knowledge while satisfying the above mentioned practical constraints. Sequential design of experiments (SDOE) is a popular suite of methods, that has yielded promising results in recent years across different engineering and practical problems. A common strategy, that leverages Bayesian formalism is the Bayesian SDOE, which usually works best in the one-step-ahead or myopic scenario of selecting a single experiment at each step of a sequence of experiments. In this work, we aim to extend the SDOE strategy, to query the experiment or computer code at a batch of inputs. To this end, we leverage deep reinforcement learning (RL) based policy gradient methods, to propose batches of queries that are selected taking into account entire budget in hand. The algorithm retains the sequential nature, inherent in the SDOE, while incorporating elements of reward based on task from the domain of deep RL. A unique capability of the proposed methodology is its ability to be applied to multiple tasks, for example optimization of a function, once its trained. We demonstrate the performance of the proposed algorithm on a synthetic problem, and a challenging high-dimensional engineering problem.
Faster Convergence in Multi-Objective Optimization Algorithms Based on Decomposition
Lavinas, Yuri, Ladeira, Marcelo, Aranha, Claus
The Resource Allocation approach (RA) improves the performance of MOEA/D by maintaining a big population and updating few solutions each generation. However, most of the studies on RA generally focused on the properties of different Resource Allocation metrics. Thus, it is still uncertain what the main factors are that lead to increments in performance of MOEA/D with RA. This study investigates the effects of MOEA/D with the Partial Update Strategy in an extensive set of MOPs to generate insights into correspondences of MOEA/D with the Partial Update and MOEA/D with small population size and big population size. Our work undertakes an in-depth analysis of the populational dynamics behaviour considering their final approximation Pareto sets, anytime hypervolume performance, attained regions and number of unique non-dominated solutions. Our results indicate that MOEA/D with Partial Update progresses with the search as fast as MOEA/D with small population size and explores the search space as MOEA/D with big population size. MOEA/D with Partial Update can mitigate common problems related to population size choice with better convergence speed in most MOPs, as shown by the results of hypervolume and number of unique non-dominated solutions, the anytime performance and Empirical Attainment Function indicates.
Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling
Luo, Bing, Xiao, Wenli, Wang, Shiqiang, Huang, Jianwei, Tassiulas, Leandros
Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large and the server's communication bandwidth is limited. Recent works on the convergence analysis of FL have focused on unbiased client sampling, e.g., sampling uniformly at random, which suffers from slow wall-clock time for convergence due to high degrees of system heterogeneity and statistical heterogeneity. This paper aims to design an adaptive client sampling algorithm that tackles both system and statistical heterogeneity to minimize the wall-clock convergence time. We obtain a new tractable convergence bound for FL algorithms with arbitrary client sampling probabilities. Based on the bound, we analytically establish the relationship between the total learning time and sampling probabilities, which results in a non-convex optimization problem for training time minimization. We design an efficient algorithm for learning the unknown parameters in the convergence bound and develop a low-complexity algorithm to approximately solve the non-convex problem. Experimental results from both hardware prototype and simulation demonstrate that our proposed sampling scheme significantly reduces the convergence time compared to several baseline sampling schemes. Notably, our scheme in hardware prototype spends 73% less time than the uniform sampling baseline for reaching the same target loss.
Daily Digest
Understanding the interactions formed between a ligand and its molecular target is key to guiding the optimization of molecules. Different experimental and computational methods have been applied to better understanding these intermolecular interactions. Here researchers report a method based on geometric deep learning that is capable of predicting the binding conformations of ligands to protein targets. The model learns a statistical potential based on the distance likelihood, which is tailor-made for each ligand–target pair. This potential can be coupled with global optimization algorithms to reproduce the experimental binding conformations of ligands.
An Online Data-Driven Emergency-Response Method for Autonomous Agents in Unforeseen Situations
Maguire, Glenn, Ketz, Nicholas, Pilly, Praveen, Mouret, Jean-Baptiste
Reinforcement learning agents perform well when presented with inputs within the distribution of those encountered during training. However, they are unable to respond effectively when faced with novel, out-of-distribution events, until they have undergone additional training. This paper presents an online, data-driven, emergency-response method that aims to provide autonomous agents the ability to react to unexpected situations that are very different from those it has been trained or designed to address. In such situations, learned policies cannot be expected to perform appropriately since the observations obtained in these novel situations would fall outside the distribution of inputs that the agent has been optimized to handle. The proposed approach devises a customized response to the unforeseen situation sequentially, by selecting actions that minimize the rate of increase of the reconstruction error from a variational auto-encoder. This optimization is achieved online in a data-efficient manner (on the order of 30 data-points) using a modified Bayesian optimization procedure. We demonstrate the potential of this approach in a simulated 3D car driving scenario, in which the agent devises a response in under 2 seconds to avoid collisions with objects it has not seen during training.