Optimization
Fixed-Time Convergence for a Class of Nonconvex-Nonconcave Min-Max Problems
This study develops a fixed-time convergent saddle point dynamical system for solving min-max problems under a relaxation of standard convexity-concavity assumption. In particular, it is shown that by leveraging the dynamical systems viewpoint of an optimization algorithm, accelerated convergence to a saddle point can be obtained. Instead of requiring the objective function to be strongly-convex--strongly-concave (as necessitated for accelerated convergence of several saddle-point algorithms), uniform fixed-time convergence is guaranteed for functions satisfying only the two-sided Polyak-{\L}ojasiewicz (PL) inequality. A large number of practical problems, including the robust least squares estimation, are known to satisfy the two-sided PL inequality. The proposed method achieves arbitrarily fast convergence compared to any other state-of-the-art method with linear or even super-linear convergence, as also corroborated in numerical case studies.
SPINS: Structure Priors aided Inertial Navigation System
Lyu, Yang, Nguyen, Thien-Minh, Liu, Liu, Cao, Muqing, Yuan, Shenghai, Nguyen, Thien Hoang, Xie, Lihua
Although Simultaneous Localization and Mapping (SLAM) has been an active research topic for decades, current state-of-the-art methods still suffer from instability or inaccuracy due to feature insufficiency or its inherent estimation drift, in many civilian environments. To resolve these issues, we propose a navigation system combing the SLAM and prior-map-based localization. Specifically, we consider additional integration of line and plane features, which are ubiquitous and more structurally salient in civilian environments, into the SLAM to ensure feature sufficiency and localization robustness. More importantly, we incorporate general prior map information into the SLAM to restrain its drift and improve the accuracy. To avoid rigorous association between prior information and local observations, we parameterize the prior knowledge as low dimensional structural priors defined as relative distances/angles between different geometric primitives. The localization is formulated as a graph-based optimization problem that contains sliding-window-based variables and factors, including IMU, heterogeneous features, and structure priors. We also derive the analytical expressions of Jacobians of different factors to avoid the automatic differentiation overhead. To further alleviate the computation burden of incorporating structural prior factors, a selection mechanism is adopted based on the so-called information gain to incorporate only the most effective structure priors in the graph optimization. Finally, the proposed framework is extensively tested on synthetic data, public datasets, and, more importantly, on the real UAV flight data obtained from a building inspection task. The results show that the proposed scheme can effectively improve the accuracy and robustness of localization for autonomous robots in civilian applications.
Collaboration in Participant-Centric Federated Learning: A Game-Theoretical Perspective
Huang, Guangjing, Chen, Xu, Ouyang, Tao, Ma, Qian, Chen, Lin, Zhang, Junshan
Federated learning (FL) is a promising distributed framework for collaborative artificial intelligence model training while protecting user privacy. A bootstrapping component that has attracted significant research attention is the design of incentive mechanism to stimulate user collaboration in FL. The majority of works adopt a broker-centric approach to help the central operator to attract participants and further obtain a well-trained model. Few works consider forging participant-centric collaboration among participants to pursue an FL model for their common interests, which induces dramatic differences in incentive mechanism design from the broker-centric FL. To coordinate the selfish and heterogeneous participants, we propose a novel analytic framework for incentivizing effective and efficient collaborations for participant-centric FL. Specifically, we respectively propose two novel game models for contribution-oblivious FL (COFL) and contribution-aware FL (CAFL), where the latter one implements a minimum contribution threshold mechanism. We further analyze the uniqueness and existence for Nash equilibrium of both COFL and CAFL games and design efficient algorithms to achieve equilibrium solutions. Extensive performance evaluations show that there exists free-riding phenomenon in COFL, which can be greatly alleviated through the adoption of CAFL model with the optimized minimum threshold.
An Optimal Motion Planning Framework for Quadruped Jumping
Song, Zhitao, Yue, Linzhu, Sun, Guangli, Ling, Yihu, Wei, Hongshuo, Gui, Linhai, Liu, Yun-Hui
This paper presents an optimal motion planning framework to generate versatile energy-optimal quadrupedal jumping motions automatically (e.g., flips, spin). The jumping motions via the centroidal dynamics are formulated as a 12-dimensional black-box optimization problem subject to the robot kino-dynamic constraints. Gradient-based approaches offer great success in addressing trajectory optimization (TO), yet, prior knowledge (e.g., reference motion, contact schedule) is required and results in sub-optimal solutions. The new proposed framework first employed a heuristics-based optimization method to avoid these problems. Moreover, a prioritization fitness function is created for heuristics-based algorithms in robot ground reaction force (GRF) planning, enhancing convergence and searching performance considerably. Since heuristics-based algorithms often require significant time, motions are planned offline and stored as a pre-motion library. A selector is designed to automatically choose motions with user-specified or perception information as input. The proposed framework has been successfully validated only with a simple continuously tracking PD controller in an open-source Mini-Cheetah by several challenging jumping motions, including jumping over a window-shaped obstacle with 30 cm height and left-flipping over a rectangle obstacle with 27 cm height.
Optimizing Empty Container Repositioning and Fleet Deployment via Configurable Semi-POMDPs
Poiani, Riccardo, Stirbu, Ciprian, Metelli, Alberto Maria, Restelli, Marcello
With the continuous growth of the global economy and markets, resource imbalance has risen to be one of the central issues in real logistic scenarios. In marine transportation, this trade imbalance leads to Empty Container Repositioning (ECR) problems. Once the freight has been delivered from an exporting country to an importing one, the laden will turn into empty containers that need to be repositioned to satisfy new goods requests in exporting countries. In such problems, the performance that any cooperative repositioning policy can achieve strictly depends on the routes that vessels will follow (i.e., fleet deployment). Historically, Operation Research (OR) approaches were proposed to jointly optimize the repositioning policy along with the fleet of vessels. However, the stochasticity of future supply and demand of containers, together with black-box and non-linear constraints that are present within the environment, make these approaches unsuitable for these scenarios. In this paper, we introduce a novel framework, Configurable Semi-POMDPs, to model this type of problems. Furthermore, we provide a two-stage learning algorithm, "Configure & Conquer" (CC), that first configures the environment by finding an approximation of the optimal fleet deployment strategy, and then "conquers" it by learning an ECR policy in this tuned environmental setting. We validate our approach in large and real-world instances of the problem. Our experiments highlight that CC avoids the pitfalls of OR methods and that it is successful at optimizing both the ECR policy and the fleet of vessels, leading to superior performance in world trade environments.
Federated Semi-Supervised Domain Adaptation via Knowledge Transfer
Das, Madhureeta, Chen, Xianhao, Yuan, Xiaoyong, Zhang, Lan
Given the rapidly changing machine learning environments and expensive data labeling, semi-supervised domain adaptation (SSDA) is imperative when the labeled data from the source domain is statistically different from the partially labeled data from the target domain. Most prior SSDA research is centrally performed, requiring access to both source and target data. However, data in many fields nowadays is generated by distributed end devices. Due to privacy concerns, the data might be locally stored and cannot be shared, resulting in the ineffectiveness of existing SSDA research. This paper proposes an innovative approach to achieve SSDA over multiple distributed and confidential datasets, named by Federated Semi-Supervised Domain Adaptation (FSSDA). FSSDA integrates SSDA with federated learning based on strategically designed knowledge distillation techniques, whose efficiency is improved by performing source and target training in parallel. Moreover, FSSDA controls the amount of knowledge transferred across domains by properly selecting a key parameter, i.e., the imitation parameter. Further, the proposed FSSDA can be effectively generalized to multi-source domain adaptation scenarios. Extensive experiments are conducted to demonstrate the effectiveness and efficiency of FSSDA design.
Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
Qiu, Shuang, Wei, Xiaohan, Ye, Jieping, Wang, Zhaoran, Yang, Zhuoran
While single-agent policy optimization in a fixed environment has attracted a lot of research attention recently in the reinforcement learning community, much less is known theoretically when there are multiple agents playing in a potentially competitive environment. We take steps forward by proposing and analyzing new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions. We consider two classes of transition structures: factored independent transition and single-controller transition. For both scenarios, we prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario. The regret of each agent is measured against a potentially adversarial opponent who can choose a single best policy in hindsight after observing the full policy sequence. Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization in a non-stationary environment. When both players adopt the proposed algorithms, their overall optimality gap is $\widetilde{\mathcal{O}}(\sqrt{K})$.
How do the Karush-Kuhn-Tucker Conditions work(Machine Learning)
The Karush-Kuhn-Tucker Conditions are a set of first derivative tests or also can be set of conditions. Abstract: This expository paper contains a concise introduction to some significant works concerning the Karush-Kuhn-Tucker condition, a necessary condition for a solution in local optimality in problems with equality and inequality constraints. The study of this optimality condition has a long history and culminated in the appearance of subdifferentials. The 1970s and early 1980s were important periods for new developments and various generalizations of subdifferentials were introduced, including the Clarke subdifferential and Demyanov-Rubinov quasidifferential. In this paper, we mainly present four generalized Karush-Kuhn-Tucker conditions or Fritz John conditions in variational analysis and set-valued analysis via Lagrange multiplier methods besides Fr$\Acute{e}$chet differentiable situation, namely subdifferentials of convex functions, generalized gradients of locally Lipschitz functions, quasidifferentials of quasidifferentiable functions and contingent epiderivatives of set-valued maps and discuss the limits of Lagrangian methods slightly in the last chapter.
Sequential Convex Programming for Collaboration of Connected and Automated Vehicles
Zhang, Xiaoxue, Ma, Jun, Cheng, Zilong, Lewis, Frank L., Lee, Tong Heng
This paper investigates the collaboration of multiple connected and automated vehicles (CAVs) in different scenarios. In general, the collaboration of CAVs can be formulated as a nonlinear and nonconvex model predictive control (MPC) problem. Most of the existing approaches available for utilization to solve such an optimization problem suffer from the drawback of considerable computational burden, which hinders the practical implementation in real time. This paper proposes the use of sequential convex programming (SCP), which is a powerful approach to solving the nonlinear and nonconvex MPC problem in real time. To appropriately deploy the methodology, as a first stage, SCP requires linearization and discretization when addressing the nonlinear dynamics of the system model adequately. Based on the linearization and discretization, the original MPC problem can be transformed into a quadratically constrained quadratic programming (QCQP) problem. Besides, SCP also involves convexification to handle the associated nonconvex constraints. Thus, the nonconvex QCQP can be reduced to a quadratic programming (QP) problem that can be solved rather quickly. Therefore, the computational efficiency is suitably improved despite the existence of nonlinear and nonconvex characteristics, whereby the implementation is realized in real time. Furthermore, simulation results in three different scenarios of autonomous driving are presented to validate the effectiveness and efficiency of our proposed approach.