Optimization
Computing Graph Edit Distance with Algorithms on Quantum Devices
Incudini, Massimiliano, Tarocco, Fabio, Mengoni, Riccardo, Di Pierro, Alessandra, Mandarino, Antonio
Distance measures provide the foundation for many popular algorithms in Machine Learning and Pattern Recognition. Different notions of distance can be used depending on the types of the data the algorithm is working on. For graph-shaped data, an important notion is the Graph Edit Distance (GED) that measures the degree of (dis)similarity between two graphs in terms of the operations needed to make them identical. As the complexity of computing GED is the same as NP-hard problems, it is reasonable to consider approximate solutions. In this paper we present a QUBO formulation of the GED problem. This allows us to implement two different approaches, namely quantum annealing and variational quantum algorithms that run on the two types of quantum hardware currently available: quantum annealer and gate-based quantum computer, respectively. Considering the current state of noisy intermediate-scale quantum computers, we base our study on proof-of-principle tests of their performance.
Adiabatic Quantum Computing for Multi Object Tracking
Zaech, Jan-Nico, Liniger, Alexander, Danelljan, Martin, Dai, Dengxin, Van Gool, Luc
Multi-Object Tracking (MOT) is most often approached in the tracking-by-detection paradigm, where object detections are associated through time. The association step naturally leads to discrete optimization problems. As these optimization problems are often NP-hard, they can only be solved exactly for small instances on current hardware. Adiabatic quantum computing (AQC) offers a solution for this, as it has the potential to provide a considerable speedup on a range of NP-hard optimization problems in the near future. However, current MOT formulations are unsuitable for quantum computing due to their scaling properties. In this work, we therefore propose the first MOT formulation designed to be solved with AQC. We employ an Ising model that represents the quantum mechanical system implemented on the AQC. We show that our approach is competitive compared with state-of-the-art optimization-based approaches, even when using of-the-shelf integer programming solvers. Finally, we demonstrate that our MOT problem is already solvable on the current generation of real quantum computers for small examples, and analyze the properties of the measured solutions.
Robust Multi-Objective Bayesian Optimization Under Input Noise
Daulton, Samuel, Cakmak, Sait, Balandat, Maximilian, Osborne, Michael A., Zhou, Enlu, Bakshy, Eytan
Bayesian optimization (BO) is a sample-efficient approach for tuning design parameters to optimize expensive-to-evaluate, black-box performance metrics. In many manufacturing processes, the design parameters are subject to random input noise, resulting in a product that is often less performant than expected. Although BO methods have been proposed for optimizing a single objective under input noise, no existing method addresses the practical scenario where there are multiple objectives that are sensitive to input perturbations. In this work, we propose the first multi-objective BO method that is robust to input noise. We formalize our goal as optimizing the multivariate value-at-risk (MVaR), a risk measure of the uncertain objectives. Since directly optimizing MVaR is computationally infeasible in many settings, we propose a scalable, theoretically-grounded approach for optimizing MVaR using random scalarizations. Empirically, we find that our approach significantly outperforms alternative methods and efficiently identifies optimal robust designs that will satisfy specifications across multiple metrics with high probability.
Optimal Resource Allocation using Python - Analytics Vidhya
This article was published as a part of the Data Science Blogathon. "True optimization is the revolutionary contribution of modern research to decision processes" – George Dantzig. We will find an optimal value for a linear equation with different linear constraints. We can solve linear programming using the PuLP library, and other Python libraries that can perform similar tasks are SciPy, Pyomo, CVXOPT. Let's discuss associated concepts and problem statements in detail.
L2C2: Locally Lipschitz Continuous Constraint towards Stable and Smooth Reinforcement Learning
This paper proposes a new regularization technique for reinforcement learning (RL) towards making policy and value functions smooth and stable. RL is known for the instability of the learning process and the sensitivity of the acquired policy to noise. Several methods have been proposed to resolve these problems, and in summary, the smoothness of policy and value functions learned mainly in RL contributes to these problems. However, if these functions are extremely smooth, their expressiveness would be lost, resulting in not obtaining the global optimal solution. This paper therefore considers RL under local Lipschitz continuity constraint, so-called L2C2. By designing the spatio-temporal locally compact space for L2C2 from the state transition at each time step, the moderate smoothness can be achieved without loss of expressiveness. Numerical noisy simulations verified that the proposed L2C2 outperforms the task performance while smoothing out the robot action generated from the learned policy.
A Reliability-aware Distributed Framework to Schedule Residential Charging of Electric Vehicles
Meyur, Rounak, Thorve, Swapna, Marathe, Madhav, Vullikanti, Anil, Swarup, Samarth, Mortveit, Henning
Residential consumers have become active participants in the power distribution network after being equipped with residential EV charging provisions. This creates a challenge for the network operator tasked with dispatching electric power to the residential consumers through the existing distribution network infrastructure in a reliable manner. In this paper, we address the problem of scheduling residential EV charging for multiple consumers while maintaining network reliability. An additional challenge is the restricted exchange of information: where the consumers do not have access to network information and the network operator does not have access to consumer load parameters. We propose a distributed framework which generates an optimal EV charging schedule for individual residential consumers based on their preferences and iteratively updates it until the network reliability constraints set by the operator are satisfied. We validate the proposed approach for different EV adoption levels in a synthetically created digital twin of an actual power distribution network. The results demonstrate that the new approach can achieve a higher level of network reliability compared to the case where residential consumers charge EVs based solely on their individual preferences, thus providing a solution for the existing grid to keep up with increased adoption rates without significant investments in increasing grid capacity.
Stochastic linear optimization never overfits with quadratically-bounded losses on general data
This work shows that a diverse collection of linear optimization methods, when run on general data, fail to overfit, despite lacking any explicit constraints or regularization: with high probability, their trajectories stay near the curve of optimal constrained solutions over the population distribution. This analysis is powered by an elementary but flexible proof scheme which can handle many settings, summarized as follows. Firstly, the data can be general: unlike other implicit bias works, it need not satisfy large margin or other structural conditions, and moreover can arrive sequentially IID, sequentially following a Markov chain, as a batch, and lastly it can have heavy tails. Secondly, while the main analysis is for mirror descent, rates are also provided for the Temporal-Difference fixed-point method from reinforcement learning; all prior high probability analyses in these settings required bounded iterates, bounded updates, bounded noise, or some equivalent. Thirdly, the losses are general, and for instance the logistic and squared losses can be handled simultaneously, unlike other implicit bias works. In all of these settings, not only is low population error guaranteed with high probability, but moreover low sample complexity is guaranteed so long as there exists any low-complexity near-optimal solution, even if the global problem structure and in particular global optima have high complexity.
A Newton-type algorithm for federated learning based on incremental Hessian eigenvector sharing
Fabbro, Nicolò Dal, Dey, Subhrakanti, Rossi, Michele, Schenato, Luca
There is a growing interest in the decentralized optimization framework that goes under the name of Federated Learning (FL). In particular, much attention is being turned to FL scenarios where the network is strongly heterogeneous in terms of communication resources (e.g., bandwidth) and data distribution. In these cases, communication between local machines (agents) and the central server (Master) is a main consideration. In this work, we present an original communication-constrained Newton-type (NT) algorithm designed to accelerate FL in such heterogeneous scenarios. The algorithm is by design robust to non i.i.d. data distributions, handles heterogeneity of agents' communication resources (CRs), only requires sporadic Hessian computations, and achieves super-linear convergence. This is possible thanks to an incremental strategy, based on a singular value decomposition (SVD) of the local Hessian matrices, which exploits (possibly) outdated second-order information. The proposed solution is thoroughly validated on real datasets by assessing (i) the number of communication rounds required for convergence, (ii) the overall amount of data transmitted and (iii) the number of local Hessian computations required. For all these metrics, the proposed approach shows superior performance against state-of-the art techniques like GIANT and FedNL.
Constrained Optimization with Dynamic Bound-scaling for Effective NLPBackdoor Defense
Shen, Guangyu, Liu, Yingqi, Tao, Guanhong, Xu, Qiuling, Zhang, Zhuo, An, Shengwei, Ma, Shiqing, Zhang, Xiangyu
We develop a novel optimization method for NLP backdoor inversion. We leverage a dynamically reducing temperature coefficient in the softmax function to provide changing loss landscapes to the optimizer such that the process gradually focuses on the ground truth trigger, which is denoted as a one-hot value in a convex hull. Our method also features a temperature rollback mechanism to step away from local optimals, exploiting the observation that local optimals can be easily determined in NLP trigger inversion (while not in general optimization). We evaluate the technique on over 1600 models (with roughly half of them having injected backdoors) on 3 prevailing NLP tasks, Figure 1: Difficult loss landscape with a low temperature with 4 different backdoor attacks and 7 architectures. Our results show that the technique is able simply extended to NLP models due to the discrete nature of to effectively and efficiently detect and remove these models.
Parallel Successive Learning for Dynamic Distributed Model Training over Heterogeneous Wireless Networks
Hosseinalipour, Seyyedali, Wang, Su, Michelusi, Nicolo, Aggarwal, Vaneet, Brinton, Christopher G., Love, David J., Chiang, Mung
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices, via iterative local updates (at devices) and global aggregations (at the server). In this paper, we develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions: (i) Network, allowing decentralized cooperation among the devices via device-to-device (D2D) communications. (ii) Heterogeneity, interpreted at three levels: (ii-a) Learning: PSL considers heterogeneous number of stochastic gradient descent iterations with different mini-batch sizes at the devices; (ii-b) Data: PSL presumes a dynamic environment with data arrival and departure, where the distributions of local datasets evolve over time, captured via a new metric for model/concept drift. (ii-c) Device: PSL considers devices with different computation and communication capabilities. (iii) Proximity, where devices have different distances to each other and the access point. PSL considers the realistic scenario where global aggregations are conducted with idle times in-between them for resource efficiency improvements, and incorporates data dispersion and model dispersion with local model condensation into FedL. Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning. We then propose network-aware dynamic model tracking to optimize the model learning vs. resource efficiency tradeoff, which we show is an NP-hard signomial programming problem. We finally solve this problem through proposing a general optimization solver. Our numerical results reveal new findings on the interdependencies between the idle times in-between the global aggregations, model/concept drift, and D2D cooperation configuration.