Optimization
Learning Underspecified Models
Cho, In-Koo, Libgober, Jonathan
This paper examines whether one can learn to play an optimal action while only knowing part of true specification of the environment. We choose the optimal pricing problem as our laboratory, where the monopolist is endowed with an underspecified model of the market demand, but can observe market outcomes. In contrast to conventional learning models where the model specification is complete and exogenously fixed, the monopolist has to learn the specification and the parameters of the demand curve from the data. We formulate the learning dynamics as an algorithm that forecast the optimal price based on the data, following the machine learning literature (Shalev-Shwartz and Ben-David (2014)). Inspired by PAC learnability, we develop a new notion of learnability by requiring that the algorithm must produce an accurate forecast with a reasonable amount of data uniformly over the class of models consistent with the part of the true specification. In addition, we assume that the monopolist has a lexicographic preference over the payoff and the complexity cost of the algorithm, seeking an algorithm with a minimum number of parameters subject to PAC-guaranteeing the optimal solution (Rubinstein (1986)). We show that for the set of demand curves with strictly decreasing uniformly Lipschitz continuous marginal revenue curve, the optimal algorithm recursively estimates the slope and the intercept of the linear demand curve, even if the actual demand curve is not linear. The monopolist chooses a misspecified model to save computational cost, while learning the true optimal decision uniformly over the set of underspecified demand curves.
Solving the Batch Stochastic Bin Packing Problem in Cloud: A Chance-constrained Optimization Approach
Yan, Jie, Lu, Yunlei, Chen, Liting, Qin, Si, Fang, Yixin, Lin, Qingwei, Moscibroda, Thomas, Rajmohan, Saravan, Zhang, Dongmei
This paper investigates a critical resource allocation problem in the first party cloud: scheduling containers to machines. There are tens of services and each service runs a set of homogeneous containers with dynamic resource usage; containers of a service are scheduled daily in a batch fashion. This problem can be naturally formulated as Stochastic Bin Packing Problem (SBPP). However, traditional SBPP research often focuses on cases of empty machines, whose objective, i.e., to minimize the number of used machines, is not well-defined for the more common reality with nonempty machines. This paper aims to close this gap. First, we define a new objective metric, Used Capacity at Confidence (UCaC), which measures the maximum used resources at a probability and is proved to be consistent for both empty and nonempty machines, and reformulate the SBPP under chance constraints. Second, by modeling the container resource usage distribution in a generative approach, we reveal that UCaC can be approximated with Gaussian, which is verified by trace data of real-world applications. Third, we propose an exact solver by solving the equivalent cutting stock variant as well as two heuristics-based solvers -- UCaC best fit, bi-level heuristics. We experimentally evaluate these solvers on both synthetic datasets and real application traces, demonstrating our methodology's advantage over traditional SBPP optimal solver minimizing the number of used machines, with a low rate of resource violations.
Downwash-aware Control Allocation for Over-actuated UAV Platforms
Su, Yao, Chu, Chi, Wang, Meng, Li, Jiarui, Yang, Liu, Zhu, Yixin, Liu, Hangxin
Abstract-- Tracking position and orientation independently affords more agile maneuver for over-actuated multirotor Unmanned Aerial Vehicles (UAVs) while introducing undesired downwash effects; downwash flows generated by thrust generators may counteract others due to close proximity, which significantly threatens the stability of the platform. The complexity of modeling aerodynamic airflow challenges control (a) Conventional control allocation framework algorithms from properly compensating for such a side effect. Leveraging the input redundancies in over-actuated UAVs, we tackle this issue with a novel control allocation framework that considers downwash effects and explores the entire allocation space for an optimal solution. This optimal solution avoids downwash effects while providing high thrust efficiency within the hardware constraints. To the best of our knowledge, ours is (b) Proposed downwash-aware control allocation framework the first formal derivation to investigate the downwash effects on over-actuated UAVs.
Real-Time Trajectory Planning for Aerial Perching
Ji, Jialin, Yang, Tiankai, Xu, Chao, Gao, Fei
This paper presents a novel trajectory planning method for aerial perching. Compared with the existing work, the terminal states and the trajectory durations can be adjusted adaptively, instead of being determined in advance. Furthermore, our planner is able to minimize the tangential relative speed on the premise of safety and dynamic feasibility. This feature is especially notable on micro aerial robots with low maneuverability or scenarios where the space is not enough. Moreover, we design a flexible transformation strategy to eliminate terminal constraints along with reducing optimization variables. Besides, we take precise SE(3) motion planning into account to ensure that the drone would not touch the landing platform until the last moment. The proposed method is validated onboard by a palm-sized micro aerial robot with quite limited thrust and moment (thrust-to-weight ratio 1.7) perching on a mobile inclined surface. Sufficient experimental results show that our planner generates an optimal trajectory within 20ms, and replans with warm start in 2ms.
Actor-Critic based Improper Reinforcement Learning
Zaki, Mohammadi, Mohan, Avinash, Gopalan, Aditya, Mannor, Shie
We consider an improper reinforcement learning setting where a learner is given $M$ base controllers for an unknown Markov decision process, and wishes to combine them optimally to produce a potentially new controller that can outperform each of the base ones. This can be useful in tuning across controllers, learnt possibly in mismatched or simulated environments, to obtain a good controller for a given target environment with relatively few trials. Towards this, we propose two algorithms: (1) a Policy Gradient-based approach; and (2) an algorithm that can switch between a simple Actor-Critic (AC) based scheme and a Natural Actor-Critic (NAC) scheme depending on the available information. Both algorithms operate over a class of improper mixtures of the given controllers. For the first case, we derive convergence rate guarantees assuming access to a gradient oracle. For the AC-based approach we provide convergence rate guarantees to a stationary point in the basic AC case and to a global optimum in the NAC case. Numerical results on (i) the standard control theoretic benchmark of stabilizing an cartpole; and (ii) a constrained queueing task show that our improper policy optimization algorithm can stabilize the system even when the base policies at its disposal are unstable.
Multi-parametric Analysis for Mixed Integer Linear Programming: An Application to Transmission Planning and Congestion Control
Liu, Jian, Bo, Rui, Wang, Siyuan
Enhancing existing transmission lines is a useful tool to combat transmission congestion and guarantee transmission security with increasing demand and boosting the renewable energy source. This study concerns the selection of lines whose capacity should be expanded and by how much from the perspective of independent system operator (ISO) to minimize the system cost with the consideration of transmission line constraints and electricity generation and demand balance conditions, and incorporating ramp-up and startup ramp rates, shutdown ramp rates, ramp-down rate limits and minimum up and minimum down times. For that purpose, we develop the ISO unit commitment and economic dispatch model and show it as a right-hand side uncertainty multiple parametric analysis for the mixed integer linear programming (MILP) problem. We first relax the binary variable to continuous variables and employ the Lagrange method and Karush-Kuhn-Tucker conditions to obtain optimal solutions (optimal decision variables and objective function) and critical regions associated with active and inactive constraints. Further, we extend the traditional branch and bound method for the large-scale MILP problem by determining the upper bound of the problem at each node, then comparing the difference between the upper and lower bounds and reaching the approximate optimal solution within the decision makers' tolerated error range. In additional, the objective function's first derivative on the parameters of each line is used to inform the selection of lines to ease congestion and maximize social welfare. Finally, the amount of capacity upgrade will be chosen by balancing the cost-reduction rate of the objective function on parameters and the cost of the line upgrade. Our findings are supported by numerical simulation and provide transmission line planners with decision-making guidance.
FORML: Learning to Reweight Data for Fairness
Yan, Bobby, Seto, Skyler, Apostoloff, Nicholas
Machine learning models are trained to minimize the mean loss for a single metric, and thus typically do not consider fairness and robustness. Neglecting such metrics in training can make these models prone to fairness violations when training data are imbalanced or test distributions differ. This work introduces Fairness Optimized Reweighting via Meta-Learning (FORML), a training algorithm that balances fairness and robustness with accuracy by jointly learning training sample weights and neural network parameters. The approach increases model fairness by learning to balance the contributions from both over- and under-represented sub-groups through dynamic reweighting of the data learned from a user-specified held-out set representative of the distribution under which fairness is desired. FORML improves equality of opportunity fairness criteria on image classification tasks, reduces bias of corrupted labels, and facilitates building more fair datasets via data condensation. These improvements are achieved without pre-processing data or post-processing model outputs, without learning an additional weighting function, without changing model architecture, and while maintaining accuracy on the original predictive metric.
Finding and Following Optimal Trajectories for an Overactuated Floating Robotic Platform
Bredenbeck, Anton, Vyas, Shubham, Suter, Willem, Zwick, Martin, Borrmann, Dorit, Olivares-Mendez, Miguel, Nüchter, Andreas
The recent increase in yearly spacecraft launches and the high number of planned launches have raised questions about maintaining accessibility to space for all interested parties. A key to sustaining the future of space-flight is the ability to service malfunctioning - and actively remove dysfunctional spacecraft from orbit. Robotic platforms that autonomously perform these tasks are a topic of ongoing research and thus must undergo thorough testing before launch. For representative system-level testing, the European Space Agency (ESA) uses, among other things, the Orbital Robotics and GNC Lab (ORGL), a flat-floor facility where air-bearing based platforms exhibit free-floating behavior in three Degrees of Freedom (DoF). This work introduces a representative simulation of a free-floating platform in the testing environment and a software framework for controller development. Finally, this work proposes a controller within that framework for finding and following optimal trajectories between arbitrary states, which is evaluated in simulation and reality.
Adaptive Learning for the Resource-Constrained Classification Problem
Abukasis, Danit Shifman, Cohen, Izack, Xian, Xiaochen, Huang, Kejun, Singer, Gonen
Classification applications are typically associated with misclassification costs and benefits as a result of incorrect and correct classification, respectively. Many studies have focused on cost-sensitive classification approaches [7, 8, 9, 10, 11, 12] in an effort to reduce the costs of misclassification. We illustrate the concept of imbalanced misclassification costs using the current and real-world example of classifying COVID-19 patients. Incorrectly classifying an ill patient as healthy may put this patient's life at risk as well as others by allowing the ill person to circulate among healthy persons and infect them (an intangible cost, usually determined by the judicial system). Classifying a healthy individual as a COVID-19 patient, on the other hand, may lead to unnecessary treatment, misuse of medical resources and cause unnecessary financial hardship to the individual and the general economy. Many studies have applied cost-sensitive approaches to handling imbalanced classification problems [13, 14] where the decision maker is interested in detecting the positive cases. There are four main approaches for making a classifier cost-sensitive: (i) changing the distribution of classes using over-and under-sampling within the training data set (i.e., preprocessing of the training data) to reduce misclassification costs [7, 8], denoted hereafter approach A1; (ii) changing the data set according to the misclassified samples of the cost-insensitive classifiers and their error costs (post-processing the training data) using a boosting approach in ensemble learning methods [12, 15], denoted hereafter approach A2; (iii) incorporating meta-learning methods on outputs of cost-insensitive learners using threshold driven techniques in favor of utilizing the probability estimations for the classes [7, 8, 16, 17], hereafter denoted A3; (iv) directly incorporating cost-sensitive capabilities into a learning algorithm, i.e., an algorithm-level solution that adapts existing learning methods so they are biased towards classes with high misclassification costs, usually presented by minority classes [8, 18].
Application of QUBO solver using black-box optimization to structural design for resonance avoidance
Matsumori, Tadayoshi, Taki, Masato, Kadowaki, Tadashi
Quadratic unconstrained binary optimization (QUBO) solvers can be applied to design an optimal structure to avoid resonance. QUBO algorithms that work on a classical or quantum device have succeeded in some industrial applications. However, their applications are still limited due to the difficulty of transforming from the original optimization problem to QUBO. Recently, black-box optimization (BBO) methods have been proposed to tackle this issue using a machine learning technique and a Bayesian treatment for combinatorial optimization. We employed the BBO methods to design a printed circuit board for resonance avoidance. This design problem is formulated to maximize natural frequency and simultaneously minimize the number of mounting points. The natural frequency, which is the bottleneck for the QUBO formulation, is approximated to a quadratic model in the BBO method. We demonstrated that BBO using a factorization machine shows good performance in both the calculation time and the success probability of finding the optimal solution. Our results can open up QUBO solvers' potential for other applications in structural designs.