Optimization
Computational Design and Fabrication of Modular Robots with Untethered Control
Bhargava, Manas, Hiraki, Takefumi, Strugaru, Malina, Zhang, Yuhan, Piovarci, Michal, Daraio, Chiara, Iwai, Daisuke, Bickel, Bernd
Natural organisms utilize distributed actuation through their musculoskeletal systems to adapt their gait for traversing diverse terrains or to morph their bodies for varied tasks. A longstanding challenge in robotics is to emulate this capability of natural organisms, which has motivated the development of numerous soft robotic systems. However, such systems are generally optimized for a single functionality, lack the ability to change form or function on demand, or remain tethered to bulky control systems. To address these limitations, we present a framework for designing and controlling robots that utilize distributed actuation. We propose a novel building block that integrates 3D-printed bones with liquid crystal elastomer (LCE) muscles as lightweight actuators, enabling the modular assembly of musculoskeletal robots. We developed LCE rods that contract in response to infrared radiation, thereby providing localized, untethered control over the distributed skeletal network and producing global deformations of the robot. To fully capitalize on the extensive design space, we introduce two computational tools: one for optimizing the robot's skeletal graph to achieve multiple target deformations, and another for co-optimizing skeletal designs and control gaits to realize desired locomotion. We validate our framework by constructing several robots that demonstrate complex shape morphing, diverse control schemes, and environmental adaptability. Our system integrates advances in modular material building, untethered and distributed control, and computational design to introduce a new generation of robots that brings us closer to the capabilities of living organisms.
Speeding Up Hyper-Heuristics With Markov-Chain Operator Selection and the Only-Worsening Acceptance Operator
Bendahi, Abderrahim, Doerr, Benjamin, Fradin, Adrien, Lutzeyer, Johannes F.
The move-acceptance hyper-heuristic was recently shown to be able to leave local optima with astonishing efficiency (Lissovoi et al., Artificial Intelligence (2023)). In this work, we propose two modifications to this algorithm that demonstrate impressive performances on a large class of benchmarks including the classic Cliff$_d$ and Jump$_m$ function classes. (i) Instead of randomly choosing between the only-improving and any-move acceptance operator, we take this choice via a simple two-state Markov chain. This modification alone reduces the runtime on Jump$_m$ functions with gap parameter $m$ from $ฮฉ(n^{2m-1})$ to $O(n^{m+1})$. (ii) We then replace the all-moves acceptance operator with the operator that only accepts worsenings. Such a, counter-intuitive, operator has not been used before in the literature. However, our proofs show that our only-worsening operator can greatly help in leaving local optima, reducing, e.g., the runtime on Jump functions to $O(n^3 \log n)$ independent of the gap size. In general, we prove a remarkably good runtime of $O(n^{k+1} \log n)$ for our Markov move-acceptance hyper-heuristic on all members of a new benchmark class SEQOPT$_k$, which contains a large number of functions having $k$ successive local optima, and which contains the commonly studied Jump$_m$ and Cliff$_d$ functions for $k=2$.
C-Flat++: Towards a More Efficient and Powerful Framework for Continual Learning
Li, Wei, Yuan, Hangjie, Zhao, Zixiang, Zhu, Yifan, Lu, Aojun, Feng, Tao, Sun, Yanan
Balancing sensitivity to new tasks and stability for retaining past knowledge is crucial in continual learning (CL). Recently, sharpness-aware minimization has proven effective in transfer learning and has also been adopted in continual learning (CL) to improve memory retention and learning efficiency. However, relying on zeroth-order sharpness alone may favor sharper minima over flatter ones in certain settings, leading to less robust and potentially suboptimal solutions. In this paper, we propose \textbf{C}ontinual \textbf{Flat}ness (\textbf{C-Flat}), a method that promotes flatter loss landscapes tailored for CL. C-Flat offers plug-and-play compatibility, enabling easy integration with minimal modifications to the code pipeline. Besides, we present a general framework that integrates C-Flat into all major CL paradigms and conduct comprehensive comparisons with loss-minima optimizers and flat-minima-based CL methods. Our results show that C-Flat consistently improves performance across a wide range of settings. In addition, we introduce C-Flat++, an efficient yet effective framework that leverages selective flatness-driven promotion, significantly reducing the update cost required by C-Flat. Extensive experiments across multiple CL methods, datasets, and scenarios demonstrate the effectiveness and efficiency of our proposed approaches. Code is available at https://github.com/WanNaa/C-Flat.
OptMark: Robust Multi-bit Diffusion Watermarking via Inference Time Optimization
Xing, Jiazheng, Ci, Hai, Xu, Hongbin, Yuan, Hangjie, Liu, Yong, Shou, Mike Zheng
However, current diffusion watermarking methods face significant limitations: zero-bit watermarking systems lack the capacity for large-scale user tracking, while multi-bit methods are highly sensitive to certain image transformations or generative attacks, resulting in a lack of comprehensive robustness. In this paper, we propose OptMark, an optimization-based approach that embeds a robust multi-bit watermark into the intermediate latents of the diffusion denoising process. OptMark strategically inserts a structural watermark early to resist generative attacks and a detail watermark late to withstand image transformations, with tailored regularization terms to preserve image quality and ensure imperceptibility. To address the challenge of memory consumption growing linearly with the number of denoising steps during optimization, OptMark incorporates adjoint gradient methods, reducing memory usage from O ( N) to O (1). Experimental results demonstrate that Opt-Mark achieves invisible multi-bit watermarking while ensuring robust resilience against valuemetric transformations, geometric transformations, editing, and regeneration attacks.
Integrating Large Language Models with Network Optimization for Interactive and Explainable Supply Chain Planning: A Real-World Case Study
This paper presents an integrated framework that combines traditional network optimization models with large language models (LLMs) to deliver interactive, explainable, and role-aware decision support for supply chain planning. The proposed system bridges the gap between complex operations research outputs and business stakeholder understanding by generating natural language summaries, contextual visualizations, and tailored key performance indicators (KPIs). The core optimization model addresses tactical inventory redistribution across a network of distribution centers for multi-period and multi-item, using a mixed-integer formulation. The technical architecture incorporates AI agents, RESTful APIs, and a dynamic user interface to support real-time interaction, configuration updates, and simulation-based insights. A case study demonstrates how the system improves planning outcomes by preventing stockouts, reducing costs, and maintaining service levels. Future extensions include integrating private LLMs, transfer learning, reinforcement learning, and Bayesian neural networks to enhance explainability, adaptability, and real-time decision-making.
PMODE: Theoretically Grounded and Modular Mixture Modeling
We introduce PMODE (Partitioned Mixture Of Density Estimators), a general and modular framework for mixture modeling with both parametric and nonparametric components. PMODE builds mixtures by partitioning the data and fitting separate estimators to each subset. It attains near-optimal rates for this estimator class and remains valid even when the mixture components come from different distribution families. As an application, we develop MV-PMODE, which scales a previously theoretical approach to high-dimensional density estimation to settings with thousands of dimensions. Despite its simplicity, it performs competitively against deep baselines on CIFAR-10 anomaly detection.
FUTURE: Flexible Unlearning for Tree Ensemble
Chen, Ziheng, Huang, Jin, Cheng, Jiali, Guo, Yuchan, Wang, Mengjie, Morishetti, Lalitesh, Nag, Kaushiki, Amiri, Hadi
Tree ensembles are widely recognized for their effectiveness in classification tasks, achieving state-of-the-art performance across diverse domains, including bioinformatics, finance, and medical diagnosis. With increasing emphasis on data privacy and the \textit{right to be forgotten}, several unlearning algorithms have been proposed to enable tree ensembles to forget sensitive information. However, existing methods are often tailored to a particular model or rely on the discrete tree structure, making them difficult to generalize to complex ensembles and inefficient for large-scale datasets. To address these limitations, we propose FUTURE, a novel unlearning algorithm for tree ensembles. Specifically, we formulate the problem of forgetting samples as a gradient-based optimization task. In order to accommodate non-differentiability of tree ensembles, we adopt the probabilistic model approximations within the optimization framework. This enables end-to-end unlearning in an effective and efficient manner. Extensive experiments on real-world datasets show that FUTURE yields significant and successful unlearning performance.
Data-Driven Bifurcation Handling in Physics-Based Reduced-Order Vascular Hemodynamic Models
Rubio, Natalia L., Darve, Eric F., Marsden, Alison L.
Three-dimensional (3D) finite-element simulations of cardiovascular flows provide high-fidelity predictions to support cardiovascular medicine, but their high computational cost limits clinical practicality. Reduced-order models (ROMs) offer computationally efficient alternatives but suffer reduced accuracy, particularly at vessel bifurcations where complex flow physics are inadequately captured by standard Poiseuille flow assumptions. We present an enhanced numerical framework that integrates machine learning-predicted bifurcation coefficients into zero-dimensional (0D) hemodynamic ROMs to improve accuracy while maintaining computational efficiency. We develop a resistor-resistor-inductor (RRI) model that uses neural networks to predict pressure-flow relationships from bifurcation geometry, incorporating linear and quadratic resistances along with inductive effects. The method employs non-dimensionalization to reduce training data requirements and apriori flow split prediction for improved bifurcation characterization. We incorporate the RRI model into a 0D model using an optimization-based solution strategy. We validate the approach in isolated bifurcations and vascular trees, across Reynolds numbers from 0 to 5,500, defining ROM accuracy by comparison to 3D finite element simulation. Results demonstrate substantial accuracy improvements: averaged across all trees and Reynolds numbers, the RRI method reduces inlet pressure errors from 54 mmHg (45%) for standard 0D models to 25 mmHg (17%), while a simplified resistor-inductor (RI) variant achieves 31 mmHg (26%) error. The enhanced 0D models show particular effectiveness at high Reynolds numbers and in extensive vascular networks. This hybrid numerical approach enables accurate, real-time hemodynamic modeling for clinical decision support, uncertainty quantification, and digital twins in cardiovascular biomedical engineering.
Distributed optimization: designed for federated learning
Guo, Wenyou, Qu, Ting, Pan, Chunrong, Huang, George Q.
--Federated Learning (FL), as a distributed collaborative Machine Learning (ML) framework under privacy-preserving constraints, has garnered increasing research attention in cross-organizational data collaboration scenarios. This paper proposes a class of distributed optimization algorithms based on the augmented Lagrangian technique, designed to accommodate diverse communication topologies in both centralized and decentralized FL settings. Furthermore, we develop multiple termination criteria and parameter update mechanisms to enhance computational efficiency, accompanied by rigorous theoretical guarantees of convergence. By generalizing the augmented Lagrangian relaxation through the incorporation of proximal relaxation and quadratic approximation, our framework systematically recovers a broad of classical unconstrained optimization methods, including proximal algorithm, classic gradient descent, and stochastic gradient descent, among others. Notably, the convergence properties of these methods can be naturally derived within the proposed theoretical framework. Numerical experiments demonstrate that the proposed algorithm exhibits strong performance in large-scale settings with significant statistical heterogeneity across clients. Such formulations, commonly referred to as consensus optimization problems, find widespread applications in interdisciplinary domains including distributed ML, collaborative sensing in sensor networks, and distributed parameter estimation [1]. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 52375498, and in part by the Fundamental Research Funds for the Central Universities under Grant 21623111. Ting Qu is with Guangdong International Cooperation Base of Science and Technology for GBA Smart Logistics, Jinan University, Zhuhai 519070, China, also with School of Intelligent Systems Science and Engineering, Jinan University, Zhuhai 519070, China, and also with Institute of Physical Internet, Jinan University, Zhuhai 519070, China (e-mail: quting@jnu.edu.cn).
CoFormer: Collaborating with Heterogeneous Edge Devices for Scalable Transformer Inference
Xu, Guanyu, Hao, Zhiwei, Shen, Li, Luo, Yong, Sun, Fuhui, Wang, Xiaoyan, Hu, Han, Wen, Yonggang
--The impressive performance of transformer models has sparked the deployment of intelligent applications on resource-constrained edge devices. However, ensuring high-quality service for real-time edge systems is a significant challenge due to the considerable computational demands and resource requirements of these models. Existing strategies typically either offload transformer computations to other devices or directly deploy compressed models on individual edge devices. T o tackle these challenges, we propose a collaborative inference system for general transformer models, termed CoFormer . The central idea behind CoFormer is to exploit the divisibility and integrability of transformer . An off-the-shelf large transformer can be decomposed into multiple smaller models for distributed inference, and their intermediate results are aggregated to generate the final output. We formulate an optimization problem to minimize both inference latency and accuracy degradation under heterogeneous hardware constraints. DeBo algorithm is proposed to first solve the optimization problem to derive the decomposition policy, and then progressively calibrate decomposed models to restore performance. We demonstrate the capability to support a wide range of transformer models on heterogeneous edge devices, achieving up to 3.1 inference speedup with large transformer models. Notably, CoFormer enables the efficient inference of GPT2-XL with 1.6 billion parameters on edge devices, reducing memory requirements by 76.3%. CoFormer can also reduce energy consumption by approximately 40% while maintaining satisfactory inference performance. Guanyu Xu, Zhiwei Hao and Han Hu are with the School of Information and Electrionics, Beijing Institute of Technology, Beijing 100081, China. Li Shen is with the School of Cyber Science and Technology, Shen-zhen Campus of Sun Y at-sen University, Shenzhen 518107, China. Y ong Luo is with the School of Computer Science, National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan 430072, China. Fuhui Sun and Xiaoyan Wang are with Information Technology Service Center of People's Court, Beijing, 100745, China. Y onggang Wen is with the College of Computing and Data Science, Nanyang Technological University, Singapore 639798. CoFormer significantly outperforms other methods. Specifically, CoFormer accelerates inference speed by 3.1 compared to Swin-L [4] with only 1.7% accuracy sacrifice.