Optimization
Adaptive Moment Estimation Optimization Algorithm Using Projection Gradient for Deep Learning
Training deep neural networks is challenging. To accelerate training and enhance performance, we propose PadamP, a novel optimization algorithm. PadamP applies adaptive estimation of the p-th power of the second-order moments under scale invariance and improves projection adaptability by modifying the projection discrimination condition. It is integrated into Adam-type algorithms, accelerating training, boosting performance, and improving generalization in deep learning. Combining the benefits of projected gradients with adaptive moment estimation, PadamP tackles unconstrained non-convex problems. We analyze convergence in the non-convex case, focusing on the decoupling of the first-order and second-order moment estimation coefficients. Unlike prior work relying on , our proof generalizes the convergence theorem, enhancing practicality. Experiments with VGG-16 and ResNet-18 on CIFAR-10 and CIFAR-100 show PadamP's effectiveness, especially for VGG-16. The results demonstrate that PadamP outperforms existing algorithms in convergence speed and generalization ability, making it a valuable addition to the field of deep learning optimization.
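To make the "p-th power of the second-order moments" idea concrete, the following is a minimal sketch of a partially adaptive moment update in plain Python. It is an illustration under stated assumptions, not the PadamP algorithm itself: the projection step and its discrimination condition are omitted, bias correction is left out for brevity, and the function name `partially_adaptive_step` and all default hyperparameters are hypothetical choices made here.

```python
def partially_adaptive_step(theta, grad, m, v, lr=0.01,
                            beta1=0.9, beta2=0.999, p=0.25, eps=1e-8):
    """One update on lists of floats; returns new (theta, m, v).

    Assumption: the step divides the first moment by the second moment
    raised to the power p. With p = 0.5 this matches an Adam-style step
    without bias correction; as p approaches 0 it approaches SGD with
    momentum.
    """
    new_theta, new_m, new_v = [], [], []
    for t, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # second-moment estimate
        new_theta.append(t - lr * m_i / (v_i ** p + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


def minimize_quadratic(start, steps=500):
    """Minimize f(x) = sum(x_i^2) from `start` using the step above."""
    theta = list(start)
    m = [0.0] * len(theta)
    v = [0.0] * len(theta)
    for _ in range(steps):
        grad = [2.0 * x for x in theta]              # gradient of sum(x_i^2)
        theta, m, v = partially_adaptive_step(theta, grad, m, v)
    return theta


if __name__ == "__main__":
    final = minimize_quadratic([1.0, -2.0])
    print(sum(x * x for x in final))
```

The exponent `p` is the only difference from a plain Adam-style denominator, which is why it is exposed as a parameter here.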
Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach
She, Ruifeng, Pang, Bowen, Li, Kai, Liu, Zehua, Zhong, Tao
As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies (data, model, sequence, and pipeline) have been successfully implemented for popular neural networks on mainstream hardware, optimizing the distributed deployment schedule requires extensive expertise and manual effort. Furthermore, while existing frameworks work with most simple chain-like structures, they struggle with complex non-linear architectures. Mixture-of-experts and multi-modal models feature intricate MIMO and branch-rich topologies that require fine-grained operator-level parallelization beyond the capabilities of existing frameworks. We propose formulating parallelism planning as a scheduling optimization problem using mixed-integer programming. We propose a bi-level solution framework balancing optimality with computational efficiency, automatically generating effective distributed plans that capture both the heterogeneous structure of modern neural networks and the underlying hardware constraints. In experiments comparing against expert-designed strategies like DeepSeek's DualPipe, our framework achieves comparable or superior performance, reducing computational bubbles by half under the same memory constraints. The framework's versatility extends beyond throughput optimization to incorporate hardware utilization maximization, memory capacity constraints, and other considerations or potential strategies. Such capabilities position our solution as both a valuable research tool for exploring optimal parallelization strategies and a practical industrial solution for large-scale AI deployment.
MonoSLAM: Robust Monocular SLAM with Global Structure Optimization
Jiang, Bingzheng, Wang, Jiayuan, Ding, Han, Zhu, Lijun
This paper presents a robust monocular visual SLAM system that simultaneously utilizes point, line, and vanishing point features for accurate camera pose estimation and mapping. To address the critical challenge of achieving reliable localization in low-texture environments, where traditional point-based systems often fail due to insufficient visual features, we introduce a novel approach that leverages the structural information of Global Primitives to improve the system's robustness and accuracy. Our key innovation lies in constructing vanishing points from line features and proposing a weighted fusion strategy to build Global Primitives in the world coordinate system. This strategy associates multiple frames with non-overlapping regions and formulates a multi-frame reprojection error optimization, significantly improving tracking accuracy in texture-scarce scenarios. Evaluations on various datasets show that our system outperforms state-of-the-art methods in trajectory precision, particularly in challenging environments.
Drift-Aware Federated Learning: A Causal Perspective
Fang, Yunjie, Wu, Sheng, Yang, Tao, Wu, Xiaofeng, Hu, Bo
Federated learning (FL) facilitates collaborative model training among multiple clients while preserving data privacy, often resulting in enhanced performance compared to models trained by individual clients. However, factors such as communication frequency and data distribution can contribute to feature drift, hindering the attainment of optimal training performance. This paper examines the relationship between model update drift and the global and local optimizers from a causal perspective. The influence of the global optimizer on feature drift primarily arises from the participation frequency of certain clients in server updates, whereas the effect of the local optimizer is typically associated with imbalanced data distributions. To mitigate this drift, we propose a novel framework termed Causal drift-Aware Federated lEarning (CAFE). CAFE exploits the causal relationship between feature-invariant components and classification outcomes to independently calibrate local client sample features and classifiers during the training phase. In the inference phase, it eliminates the drifts in the global model that favor frequently communicating clients. Experimental results demonstrate that CAFE's integration of feature calibration, parameter calibration, and historical information effectively reduces both drift towards majority classes and tendencies toward frequently communicating nodes.
Mobility-aware Seamless Service Migration and Resource Allocation in Multi-edge IoV Systems
Chen, Zheyi, Huang, Sijin, Min, Geyong, Ning, Zhaolong, Li, Jie, Zhang, Yan
Mobile Edge Computing (MEC) offers low-latency and high-bandwidth support for Internet-of-Vehicles (IoV) applications. However, due to high vehicle mobility and finite communication coverage of base stations, it is hard to maintain uninterrupted and high-quality services without proper service migration among MEC servers. Existing solutions commonly rely on prior knowledge and rarely consider efficient resource allocation during the service migration process, making it hard to reach optimal performance in dynamic IoV environments. To address these important challenges, we propose SR-CL, a novel mobility-aware seamless Service migration and Resource allocation framework via Convex-optimization-enabled deep reinforcement Learning in multi-edge IoV systems. First, we decouple the Mixed Integer Nonlinear Programming (MINLP) problem of service migration and resource allocation into two sub-problems. Next, we design a new actor-critic-based asynchronous-update deep reinforcement learning method to handle service migration, where the delayed-update actor makes migration decisions and the one-step-update critic evaluates the decisions to guide the policy update. Notably, we theoretically derive the optimal resource allocation with convex optimization for each MEC server, thereby further improving system performance. Using real-world vehicle trajectory datasets and a testbed, extensive experiments are conducted to verify the effectiveness of the proposed SR-CL. Compared to benchmark methods, SR-CL achieves superior convergence and delay performance under various scenarios. However, the real-time demands of IoV applications pose significant challenges for onboard processors with limited computational capabilities [2]. When vehicles offload tasks, MEC servers create dedicated service instances via virtualization techniques for the vehicles and allocate proper resources to them [7].
An Unsupervised C-Uniform Trajectory Sampler with Applications to Model Predictive Path Integral Control
Poyrazoglu, O. Goktug, Moorthy, Rahul, Cao, Yukang, Chastek, William, Isler, Volkan
Sampling-based model predictive controllers generate trajectories by sampling control inputs from a fixed, simple distribution such as the normal or uniform distributions. This sampling method yields trajectory samples that are tightly clustered around a mean trajectory. This clustering behavior, in turn, limits the exploration capability of the controller and reduces the likelihood of finding feasible solutions in complex environments. Recent work has attempted to address this problem by either reshaping the resulting trajectory distribution or increasing the sample entropy to enhance diversity and promote exploration. In our recent work, we introduced the concept of C-Uniform trajectory generation [1], which allows the computation of control input probabilities to generate trajectories that sample the configuration space uniformly. We introduce Neural C-Uniform, an unsupervised C-Uniform trajectory sampler that mitigates scalability issues by computing control input probabilities without relying on a discretized configuration space. Experiments show that Neural C-Uniform achieves a similar uniformity ratio to the original C-Uniform approach and generates trajectories over a longer time horizon while preserving uniformity. We analyze the performance of CU-MPPI in simulation and real-world experiments. Our results indicate that in settings where the optimal solution has high curvature, CU-MPPI leads to drastic improvements in performance. Sampling-based model predictive controllers generate "minimum cost" trajectories using a set of trajectory samples to achieve objectives such as arriving at a goal location while avoiding obstacles and adhering to motion constraints.
Geometric Data-Driven Multi-Jet Locomotion Inspired by Salps
Yang, Yanhao, Hecht, Nina L., Salaman-Maclara, Yousef, Justus, Nathan, Thomas, Zachary A., Rozaidi, Farhan, Hatton, Ross L.
Salps are marine animals consisting of chains of jellyfish-like units. Their capacity for effective underwater undulatory locomotion through coordinating multi-jet propulsion has aroused significant interest in the field of robotics and inspired extensive research including design, modeling, and control. In this paper, we conduct a comprehensive analysis of the locomotion of salp-like systems using the robotic platform "LandSalp" based on geometric mechanics, including mechanism design, dynamic modeling, system identification, and motion planning and control. Our work takes a step toward a better understanding of salps' underwater locomotion and provides a clear path for extending these insights to more complex and capable underwater robotic systems. Furthermore, this study illustrates the effectiveness of geometric mechanics in bio-inspired robots for efficient data-driven locomotion modeling, demonstrated by learning the dynamics of LandSalp from only 3 minutes of experimental data. Lastly, we extend the geometric mechanics principles to multi-jet propulsion systems with stability considerations and validate the theory through experiments on the LandSalp hardware. These creatures are capable of efficient underwater undulatory locomotion by coordinating multi-jet propulsion. The structure and locomotion patterns of salps are closely related, which has attracted widespread interest in both biological and ecological research [1-5]. In the field of robotics, salps have attracted increasing attention due to their jet propulsion by expelling water through contraction, efficient underwater locomotion, and multi-unit coordination. Salps and jellyfish have inspired numerous robotic studies on the design of jet propulsion soft robots [6-12] and multi-robot coordination [13-17].
However, in the field of motion planning and control, most studies primarily consider undulatory locomotion by self-propulsion via body deformation [18-23], with only a few works involving underwater locomotion using jet propulsion [24-26]. This work was supported in part by ONR Award N00014-23-1-2171. All the authors are with the Collaborative Robotics and Intelligent Systems (CoRIS) Institute at Oregon State University, Corvallis, OR USA. The units composing biological salps are called "zooids" (i.e., pseudoanimals or not-quite-animals) because they exhibit many properties of animals but are not independent organisms from the colony. To discuss the general properties of multi-jet locomotion without making claims about the biological systems that inspire them, we use the terminology "chains" and "units" throughout this paper. The salp picture is reproduced from [27].
Simulator Ensembles for Trustworthy Autonomous Driving Testing
Sorokin, Lev, Biagiola, Matteo, Stocco, Andrea
Scenario-based testing with driving simulators is extensively used to identify failing conditions of automated driving assistance systems (ADAS) and reduce the amount of in-field road testing. However, existing studies have shown that repeated test execution in the same as well as in distinct simulators can yield different outcomes, which can be attributed to sources of flakiness or different implementations of the physics, among other factors. In this paper, we present MultiSim, a novel approach to multi-simulation ADAS testing based on a search-based testing approach that leverages an ensemble of simulators to identify failure-inducing, simulator-agnostic test scenarios. During the search, each scenario is evaluated jointly on multiple simulators. Scenarios that produce consistent results across simulators are prioritized for further exploration, while those that fail on only a subset of simulators are given less priority, as they may reflect simulator-specific issues rather than generalizable failures. Our case study, which involves testing a deep neural network-based ADAS on different pairs of three widely used simulators, demonstrates that MultiSim outperforms single-simulator testing by achieving, on average, a 51% higher rate of simulator-agnostic failures. Compared to a state-of-the-art multi-simulator approach that combines the outcome of independent test generation campaigns obtained in different simulators, MultiSim identifies 54% more simulator-agnostic failing tests while showing a comparable validity rate. An enhancement of MultiSim that leverages surrogate models to predict simulator disagreements and bypass executions not only increases the average number of valid failures but also improves efficiency in finding the first valid failure.
Revisiting Frank-Wolfe for Structured Nonconvex Optimization
Maskan, Hoomaan, Hou, Yikun, Sra, Suvrit, Yurtsever, Alp
We introduce a new projection-free (Frank-Wolfe) method for optimizing structured nonconvex functions that are expressed as a difference of two convex functions. This problem class subsumes smooth nonconvex minimization, positioning our method as a promising alternative to the classical Frank-Wolfe algorithm. DC decompositions are not unique; by carefully selecting a decomposition, we can better exploit the problem structure, improve computational efficiency, and adapt to the underlying problem geometry to find better local solutions. We prove that the proposed method achieves a first-order stationary point in $O(1/\epsilon^2)$ iterations, matching the complexity of the standard Frank-Wolfe algorithm for smooth nonconvex minimization in general. Specific decompositions can, for instance, yield a gradient-efficient variant that requires only $O(1/\epsilon)$ calls to the gradient oracle. Finally, we present numerical experiments demonstrating the effectiveness of the proposed method compared to the standard Frank-Wolfe algorithm.
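For readers unfamiliar with the baseline, here is a minimal sketch of the classical Frank-Wolfe iteration that the abstract compares against, not the proposed difference-of-convex variant. The feasible set (the probability simplex), the quadratic objective, the step size 2/(k+2), and the names `frank_wolfe_simplex` and `target` are all assumptions chosen here for illustration.

```python
def frank_wolfe_simplex(grad_fn, x0, iterations=200):
    """Classical Frank-Wolfe over the probability simplex.

    The linear minimization oracle over the simplex returns the vertex
    whose gradient coordinate is smallest, so no projection is needed.
    """
    x = list(x0)
    for k in range(iterations):
        grad = grad_fn(x)
        best = min(range(len(x)), key=lambda i: grad[i])  # LMO: best vertex
        gamma = 2.0 / (k + 2.0)                           # standard step size
        # Move toward the vertex e_best; x stays a convex combination.
        x = [(1.0 - gamma) * xi for xi in x]
        x[best] += gamma
    return x


def objective(x, target):
    """f(x) = 0.5 * ||x - target||^2."""
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))


if __name__ == "__main__":
    target = [0.2, 0.5, 0.3]                  # lies inside the simplex
    grad_fn = lambda x: [xi - ti for xi, ti in zip(x, target)]
    x = frank_wolfe_simplex(grad_fn, [1.0, 0.0, 0.0])
    print(x, objective(x, target))
```

Each iterate is a convex combination of simplex vertices, which is what "projection-free" means in this setting: feasibility is preserved by construction.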
Large Neighborhood Search and Bitmask Dynamic Programming for Wireless Mobile Charging Electric Vehicle Routing Problems in Medical Transportation
Zhao, Jingyi, Yang, Haoxiang, Liu, Yang
The transition to electric vehicles (EVs) is critical to achieving sustainable transportation, but challenges such as limited driving range and insufficient charging infrastructure have hindered the widespread adoption of EVs, especially in time-sensitive logistics such as medical transportation. This paper presents a new model to break through this barrier by combining wireless mobile charging technology with optimization. We propose the Wireless Mobile Charging Electric Vehicle Routing Problem (WMC-EVRP), which enables Medical Transportation Electric Vehicles (MTEVs) to be charged while traveling via Mobile Charging Carts (MCTs). This eliminates the time wastage of stopping for charging and ensures uninterrupted operation of MTEVs for such time-sensitive transportation problems. However, in this problem, the decisions of these two types of heterogeneous vehicles are coupled with each other, which greatly increases the difficulty of vehicle routing optimizations. To address this complex problem, we develop a mathematical model and a tailored meta-heuristic algorithm that combines Bit Mask Dynamic Programming (BDP) and Large Neighborhood Search (LNS). The BDP approach efficiently optimizes charging strategies, while the LNS framework utilizes custom operators to optimize the MTEV routes under capacity and synchronization constraints. Our approach outperforms traditional solvers in providing solutions for medium and large instances. Using actual hospital locations in Singapore as data, we validated the practical applicability of the model through extensive experiments and provided important insights into minimizing costs and ensuring the timely delivery of healthcare services.
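As a rough illustration of the bitmask dynamic programming technique named above, the sketch below solves a small travelling-salesman instance with the textbook Held-Karp recurrence. It is not the paper's BDP for charging strategies: the problem, the function name `shortest_tour`, and the distance matrix are assumptions made here, chosen only to show how a set of visited nodes is encoded as the bits of an integer.

```python
def shortest_tour(dist):
    """Held-Karp bitmask DP: cheapest tour from node 0 through all nodes.

    best[mask][j] is the minimum cost of a path that starts at node 0,
    visits exactly the nodes whose bits are set in `mask`, and ends at j.
    """
    n = len(dist)
    inf = float("inf")
    best = [[inf] * n for _ in range(1 << n)]
    best[1][0] = 0                                 # only node 0 visited
    for mask in range(1 << n):
        if not mask & 1:                           # every path contains node 0
            continue
        for j in range(n):
            cost = best[mask][j]
            if cost == inf:
                continue
            for k in range(n):
                if mask & (1 << k):                # node k already visited
                    continue
                nxt = mask | (1 << k)
                cand = cost + dist[j][k]
                if cand < best[nxt][k]:
                    best[nxt][k] = cand
    full = (1 << n) - 1
    # Close the tour by returning to node 0.
    return min(best[full][j] + dist[j][0] for j in range(n))


if __name__ == "__main__":
    dist = [
        [0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0],
    ]
    print(shortest_tour(dist))  # 80, via the tour 0 -> 1 -> 3 -> 2 -> 0
```

The table has 2^n rows, so this style of DP is practical only for small subproblems, which is consistent with pairing it with a large neighborhood search for the overall routing.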