Optimization
A Graph-Partitioning Based Continuous Optimization Approach to Semi-supervised Clustering Problems
Liu, Wei, Liu, Xin, Ng, Michael K., Zhang, Zaikun
Semi-supervised clustering is a basic problem in various applications. Most existing methods require knowledge of the ideal cluster number, which is often difficult to obtain in practice. Besides, satisfying the must-link constraints is another major challenge for these methods. In this work, we view the semi-supervised clustering task as a partitioning problem on a graph associated with the given dataset, where the similarity matrix includes a scaling parameter to reflect the must-link constraints. Utilizing a relaxation technique, we formulate the graph partitioning problem into a continuous optimization model that does not require the exact cluster number, but only an overestimate of it. We then propose a block coordinate descent algorithm to efficiently solve this model, and establish its convergence result. Based on the obtained solution, we can construct the clusters that theoretically meet the must-link constraints under mild assumptions. Furthermore, we verify the effectiveness and efficiency of our proposed method through comprehensive numerical experiments.
Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization
Liu, Shuang, Wang, Yihan, Zhu, Yifan, Miao, Yibo, Gao, Xiao-Shan
Wasserstein distributionally robust optimization (WDRO) optimizes against worst-case distributional shifts within a specified uncertainty set, leading to enhanced generalization on unseen adversarial examples, compared to standard adversarial training which focuses on pointwise adversarial perturbations. However, WDRO still suffers fundamentally from the robust overfitting problem, as it does not consider statistical error. We address this gap by proposing a novel robust optimization framework under a new uncertainty set for adversarial noise via Wasserstein distance and statistical error via Kullback-Leibler divergence, called the Statistically Robust WDRO. We establish a robust generalization bound for the new optimization framework, implying that out-of-distribution adversarial performance is at least as good as the statistically robust training loss with high probability. Furthermore, we derive conditions under which Stackelberg and Nash equilibria exist between the learner and the adversary, giving an optimal robust model in certain sense. Finally, through extensive experiments, we demonstrate that our method significantly mitigates robust overfitting and enhances robustness within the framework of WDRO. Optimization in machine learning is often challenged by data ambiguity caused by natural noise, adversarial attacks (Goodfellow et al., 2014), or other distributional shifts (Malinin et al., 2021; 2022). To develop robust models against data ambiguity, Wasserstein Distributionally Robust Optimization (WDRO) (Kuhn et al., 2019) has emerged as a powerful modeling framework, which has recently attracted significant attention attributed to its connections to generalization and robustness (Lee & Raginsky, 2018; Mohajerin Esfahani & Kuhn, 2018; An & Gao, 2021). From both intuitive and theoretical perspectives, WDRO can be considered as a more comprehensive framework that subsumes and extends adversarial training (Staib & Jegelka, 2017; Sinha et al., 2017; Bui et al., 2022; Phan et al., 2023). In contrast to standard adversarial training (Madry et al., 2017), WDRO operates on a broader scale by perturbing the entire data distribution rather than pointwise adversarial samples.
Incorporating Surrogate Gradient Norm to Improve Offline Optimization Techniques
Dao, Manh Cuong, Nguyen, Phi Le, Truong, Thao Nguyen, Hoang, Trong Nghia
Offline optimization has recently emerged as an increasingly popular approach to mitigate the prohibitively expensive cost of online experimentation. The key idea is to learn a surrogate of the black-box function that underlines the target experiment using a static (offline) dataset of its previous input-output queries. Such an approach is, however, fraught with an out-of-distribution issue where the learned surrogate becomes inaccurate outside the offline data regimes. To mitigate this, existing offline optimizers have proposed numerous conditioning techniques to prevent the learned surrogate from being too erratic. Nonetheless, such conditioning strategies are often specific to particular surrogate or search models, which might not generalize to a different model choice. This motivates us to develop a model-agnostic approach instead, which incorporates a notion of model sharpness into the training loss of the surrogate as a regularizer. Our approach is supported by a new theoretical analysis demonstrating that reducing surrogate sharpness on the offline dataset provably reduces its generalized sharpness on unseen data. Our analysis extends existing theories from bounding generalized prediction loss (on unseen data) with loss sharpness to bounding the worst-case generalized surrogate sharpness with its empirical estimate on training data, providing a new perspective on sharpness regularization. Our extensive experimentation on a diverse range of optimization tasks also shows that reducing surrogate sharpness often leads to significant improvement, marking (up to) a noticeable 9.6% performance boost. Our code is publicly available at https://github.com/cuong-dm/IGNITE
FUSE: First-Order and Second-Order Unified SynthEsis in Stochastic Optimization
Jiang, Zhanhong, Hasan, Md Zahid, Balu, Aditya, Waite, Joshua R., Huang, Genyi, Sarkar, Soumik
Stochastic optimization methods have actively been playing a critical role in modern machine learning algorithms to deliver decent performance. While numerous works have proposed and developed diverse approaches, first-order and second-order methods are in entirely different situations. The former is significantly pivotal and dominating in emerging deep learning but only leads convergence to a stationary point. However, second-order methods are less popular due to their computational intensity in large-dimensional problems. This paper presents a novel method that leverages both the first-order and second-order methods in a unified algorithmic framework, termed FUSE, from which a practical version (PV) is derived accordingly. FUSE-PV stands as a simple yet efficient optimization method involving a switch-over between first and second orders. Additionally, we develop different criteria that determine when to switch. FUSE-PV has provably shown a smaller computational complexity than SGD and Adam. To validate our proposed scheme, we present an ablation study on several simple test functions and show a comparison with baselines for benchmark datasets.
Fusion of Various Optimization Based Feature Smoothing Methods for Wearable and Non-invasive Blood Glucose Estimation
Wei, Yiting, Ling, Bingo Wing-Kuen, Chen, Danni, Dai, Yuheng, Liu, Qing
Recently, the wearable and non-invasive blood glucose estimation approach has been proposed. However, due to the unreliability of the acquisition device, the presence of the noise and the variations of the acquisition environments, the obtained features and the reference blood glucose values are highly unreliable. To address this issue, this paper proposes a polynomial fitting approach to smooth the obtained features or the reference blood glucose values. First, the blood glucose values are estimated based on the individual optimization approaches. Second, the absolute difference values between the estimated blood glucose values and the actual blood glucose values based on each optimization approach are computed. Third, these absolute difference values for each optimization approach are sorted in the ascending order. Fourth, for each sorted blood glucose value, the optimization method corresponding to the minimum absolute difference value is selected. Fifth, the accumulate probability of each selected optimization method is computed. If the accumulate probability of any selected optimization method at a point is greater than a threshold value, then the accumulate probabilities of these three selected optimization methods at that point are reset to zero. A range of the sorted blood glucose values are defined as that with the corresponding boundaries points being the previous reset point and this reset point. Hence, after performing the above procedures for all the sorted reference blood glucose values in the validation set, the regions of the sorted reference blood glucose values and the corresponding optimization methods in these regions are determined. The computer numerical simulation results show that our proposed method yields the mean absolute relative deviation (MARD) at 0.0930 and the percentage of the test data falling in the zone A of the Clarke error grid at 94.1176%.
Navigating Intelligence: A Survey of Google OR-Tools and Machine Learning for Global Path Planning in Autonomous Vehicles
Benoit, Alexandre, Asef, Pedram
We offer a new in-depth investigation of global path planning (GPP) for unmanned ground vehicles, an autonomous mining sampling robot named ROMIE. GPP is essential for ROMIE's optimal performance, which is translated into solving the traveling salesman problem, a complex graph theory challenge that is crucial for determining the most effective route to cover all sampling locations in a mining field. This problem is central to enhancing ROMIE's operational efficiency and competitiveness against human labor by optimizing cost and time. The primary aim of this research is to advance GPP by developing, evaluating, and improving a cost-efficient software and web application. We delve into an extensive comparison and analysis of Google operations research (OR)-Tools optimization algorithms. Our study is driven by the goal of applying and testing the limits of OR-Tools capabilities by integrating Reinforcement Learning techniques for the first time. This enables us to compare these methods with OR-Tools, assessing their computational effectiveness and real-world application efficiency. Our analysis seeks to provide insights into the effectiveness and practical application of each technique. Our findings indicate that Q-Learning stands out as the optimal strategy, demonstrating superior efficiency by deviating only 1.2% on average from the optimal solutions across our datasets.
Endpoint-Explicit Differential Dynamic Programming via Exact Resolution
Parilli, Maria, Martinez, Sergi, Mastalli, Carlos
We introduce a novel method for handling endpoint constraints in constrained differential dynamic programming (DDP). Unlike existing approaches, our method guarantees quadratic convergence and is exact, effectively managing rank deficiencies in both endpoint and stagewise equality constraints. It is applicable to both forward and inverse dynamics formulations, making it particularly well-suited for model predictive control (MPC) applications and for accelerating optimal control (OC) solvers. We demonstrate the efficacy of our approach across a broad range of robotics problems and provide a user-friendly open-source implementation within CROCODDYL.
Direct Sparse Odometry with Continuous 3D Gaussian Maps for Indoor Environments
Deng, Jie, Lang, Fengtian, Yuan, Zikang, Yang, Xin
Direct Sparse Odometry with Continuous 3D Gaussian Maps for Indoor Environments Jie Deng 1, Fengtian Lang 1, Zikang Y uan 2 and Xin Y ang 1 Abstract -- Accurate localization is essential for robotics and augmented reality applications such as autonomous navigation. Vision-based methods combining prior maps aim to integrate LiDAR-level accuracy with camera cost efficiency for robust pose estimation. Existing approaches, however, often depend on unreliable interpolation procedures when associating discrete point cloud maps with dense image pixels, which inevitably introduces depth errors and degrades pose estimation accuracy. We propose a monocular visual odometry framework utilizing a continuous 3D Gaussian map, which directly assigns geometrically consistent depth values to all extracted high-gradient points without interpolation. Evaluations on two public datasets demonstrate superior tracking accuracy compared to existing methods. We have released the source code of this work for the development of the community. I NTRODUCTION Visual odometry (VO)/visual-inertial odometry (VIO) is a crucial capability in a wide range of technologies, including robotics, unmanned aerial vehicles and mixed reality.
Adaptive Negative Damping Control for User-Dependent Multi-Terrain Walking Assistance with a Hip Exoskeleton
Ramella, Giulia, Ijspeert, Auke, Bouri, Mohamed
Adaptive Negative Damping Control for User-Dependent Multi-T errain Walking Assistance with a Hip Exoskeleton Giulia Ramella 1, 2, Auke Ijspeert 2, and Mohamed Bouri 1, 3 Abstract -- Hip exoskeletons are known for their versatility in assisting users across varied scenarios. However, current assistive strategies often lack the flexibility to accommodate for individual walking patterns and adapt to diverse locomotion environments. In this work, we present a novel control strategy that adapts the mechanical impedance of the human-exoskeleton system. We design the hip assistive torques as an adaptive virtual negative damping, which is able to inject energy into the system while allowing the users to remain in control and contribute voluntarily to the movements. Experiments with five healthy subjects demonstrate that our controller reduces the metabolic cost of walking compared to free walking (average reduction of 7 . Additionally, our method achieves minimal power losses from the exoskeleton across the entire gait cycle (less than 2% negative mechanical power out of the total power), ensuring synchronized action with the users' movements. Moreover, we use Bayesian Optimization to adapt the assistance strength and allow for seamless adaptation and transitions across multi-terrain environments. Our strategy achieves efficient power transmission under all conditions. Our approach demonstrates an individualized, adaptable, and straightforward controller for hip exoskeletons, advancing the development of viable, adaptive, and user-dependent control laws.
Towards Resilient and Sustainable Global Industrial Systems: An Evolutionary-Based Approach
Jirkovský, Václav, Kubalík, Jiří, Kadera, Petr, Schirrmann, Arnd, Mitschke, Andreas, Zindel, Andreas
This paper presents a new complex optimization problem in the field of automatic design of advanced industrial systems and proposes a hybrid optimization approach to solve the problem. The problem is multi-objective as it aims at finding solutions that minimize CO2 emissions, transportation time, and costs. The optimization approach combines an evolutionary algorithm and classical mathematical programming to design resilient and sustainable global manufacturing networks. Further, it makes use of the OWL ontology for data consistency and constraint management. The experimental validation demonstrates the effectiveness of the approach in both single and double sourcing scenarios. The proposed methodology, in general, can be applied to any industry case with complex manufacturing and supply chain challenges.