Optimization
SAFE: Finding Sparse and Flat Minima to Improve Pruning
Lee, Dongyeop, Lee, Kwanhee, Chung, Jinseok, Lee, Namhoon
Sparsifying neural networks often suffers from seemingly inevitable performance degradation, and it remains challenging to restore the original performance despite much recent progress. Motivated by recent studies in robust optimization, we aim to tackle this problem by finding subnetworks that are both sparse and flat at the same time. Specifically, we formulate pruning as a sparsity-constrained optimization problem where flatness is encouraged as an objective. We solve it explicitly via an augmented Lagrange dual approach and extend it further by proposing a generalized projection operation, resulting in novel pruning methods called SAFE and its extension, SAFE$^+$. Extensive evaluations on standard image classification and language modeling tasks reveal that SAFE consistently yields sparse networks with improved generalization performance, which compares competitively to well-established baselines. In addition, SAFE demonstrates resilience to noisy data, making it well-suited for real-world conditions.
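As a rough illustration of the sparsity constraint mentioned in the abstract above, the simplest Euclidean projection onto the set of vectors with at most k nonzero entries is magnitude-based hard thresholding. The sketch below shows only that textbook projection; it is not the generalized projection used by SAFE$^+$, and the function name `project_top_k` is a hypothetical choice for this example.

```python
def project_top_k(weights, k):
    """Project onto {w : ||w||_0 <= k} by keeping the k largest-magnitude entries.

    Illustrative only: plain hard thresholding, not the paper's generalized projection.
    """
    if k <= 0:
        return [0.0] * len(weights)
    # Indices ordered by decreasing magnitude; sorted() is stable, so ties
    # are broken in favor of the earlier index.
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    keep = set(order[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


if __name__ == "__main__":
    print(project_top_k([0.5, -2.0, 0.1, 1.5], 2))
```

In a constrained training loop, a projection of this kind would typically be applied to the weights after a gradient step, so that the iterate stays within the sparsity budget.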
Rethinking DPO: The Role of Rejected Responses in Preference Misalignment
Cho, Jay Hyeon, Oh, JunHyeok, Kim, Myunsoo, Lee, Byung-Jun
Direct Preference Optimization (DPO) is a simple and efficient framework that has attracted substantial attention. However, it often struggles to meet its primary objectives -- increasing the generation probability of chosen responses while reducing that of rejected responses -- due to the dominant influence of rejected responses on the loss function. This imbalance leads to suboptimal performance in promoting preferred responses. In this work, we systematically analyze the limitations of DPO and existing algorithms designed to achieve the objectives stated above. To address these limitations, we propose Bounded-DPO (BDPO), a novel method that bounds the influence of rejected responses while maintaining the original optimization structure of DPO. Through theoretical analysis and empirical evaluations, we demonstrate that BDPO achieves a balanced optimization of the chosen and rejected responses, outperforming existing algorithms.
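To make the imbalance described in the abstract above concrete, here is a minimal sketch of the standard per-example DPO loss, -log sigmoid(beta * ((log pi(chosen) - log ref(chosen)) - (log pi(rejected) - log ref(rejected)))). It does not implement BDPO, whose bounding term is not given in the abstract; the function name, argument names, and the default beta=0.1 are assumptions made for this example.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard per-example DPO loss from policy and reference log-probabilities."""
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Policy equal to the reference: the margin is zero and the loss is log(2).
    print(dpo_loss(-4.0, -10.0, -4.0, -10.0))
    # The chosen log-probability drops by 1 and the rejected one by 10:
    # the loss still decreases, because only the difference matters.
    print(dpo_loss(-5.0, -20.0, -4.0, -10.0))
```

The second call shows how the loss can fall while the chosen response becomes less likely, as long as the rejected response's log-probability falls faster.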
Constrained Diffusers for Safe Planning and Control
Zhang, Jichen, Zhao, Liqun, Papachristodoulou, Antonis, Umenberger, Jack
Diffusion models have shown remarkable potential in planning and control tasks due to their ability to represent multimodal distributions over actions and trajectories. However, ensuring safety under constraints remains a critical challenge for diffusion models. This paper proposes Constrained Diffusers, a novel framework that incorporates constraints into pre-trained diffusion models without retraining or architectural modifications. Inspired by constrained optimization, we apply a constrained Langevin sampling mechanism for the reverse diffusion process that jointly optimizes the trajectory and realizes constraint satisfaction through three iterative algorithms: projected method, primal-dual method and augmented Lagrangian approaches. In addition, we incorporate discrete control barrier functions as constraints for constrained diffusers to guarantee safety in online implementation. Experiments in Maze2D, locomotion, and pybullet ball running tasks demonstrate that our proposed methods achieve constraint satisfaction with less computation time, and are competitive to existing methods in environments with static and time-varying constraints.
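The projected variant of constrained Langevin sampling named in the abstract above can be sketched on a toy problem: take a Langevin step along the score, then project the iterate back onto the feasible set. This is a minimal sketch under stated assumptions, not the paper's algorithm: the score is the analytic score of a standard Gaussian rather than a pre-trained diffusion model, the constraint is a simple box, and all names (`projected_langevin`, `gaussian_score`, `box_projection`) are hypothetical.

```python
import math
import random


def gaussian_score(x):
    """Score (gradient of the log-density) of a standard Gaussian; a toy stand-in."""
    return [-xi for xi in x]


def box_projection(x, low=1.0, high=2.0):
    """Euclidean projection onto the box [low, high]^d (assumed constraint set)."""
    return [min(high, max(low, xi)) for xi in x]


def projected_langevin(score, project, x0, step=0.01, n_steps=500, seed=0):
    """Alternate an unadjusted Langevin step with a projection onto the constraint set."""
    rng = random.Random(seed)
    x = list(x0)
    noise_scale = math.sqrt(2.0 * step)
    for _ in range(n_steps):
        grad = score(x)
        x = [xi + step * gi + noise_scale * rng.gauss(0.0, 1.0)
             for xi, gi in zip(x, grad)]
        x = project(x)  # enforce the constraint after every step
    return x


if __name__ == "__main__":
    print(projected_langevin(gaussian_score, box_projection, [0.0, 5.0]))
```

Because the projection runs after every update, each returned sample satisfies the box constraint by construction. The primal-dual and augmented Lagrangian variants mentioned in the abstract would handle the constraint through multipliers instead and are not shown here.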
Automated Heuristic Design for Unit Commitment Using Large Language Models
Lv, Junjin, Cui, Chenggang, Zhang, Shaodi, Chen, Hui, Gong, Chunyang, Liu, Jiaming
The Unit Commitment (UC) problem is a classic challenge in the optimal scheduling of power systems. Years of research and practice have shown that formulating reasonable unit commitment plans can significantly improve the economic efficiency of power system operations. In recent years, with the introduction of technologies such as machine learning and the Lagrangian relaxation method, the solution methods for the UC problem have become increasingly diversified, but they still face challenges in terms of accuracy and robustness. This paper proposes a Function Space Search (FunSearch) method based on large language models. This method combines pre-trained large language models and evaluators to creatively generate solutions through the program search and evolution process while ensuring their rationality. The simulation experiments mainly use a unit commitment case with 10 units. Compared to the genetic algorithm, the results show that FunSearch performs better in terms of sampling time, evaluation time, and total operating cost of the system, demonstrating its great potential as an effective tool for solving the UC problem.
Optimized Spectral Fault Receptive Fields for Diagnosis-Informed Prognosis
Gutiérrez, Stan Muñoz, Wotawa, Franz
This paper introduces Spectral Fault Receptive Fields (SFRFs), a biologically inspired technique for degradation state assessment in bearing fault diagnosis and remaining useful life (RUL) estimation. Drawing on the center-surround organization of retinal ganglion cell receptive fields, we propose a frequency-domain feature extraction algorithm that enhances the detection of fault signatures in vibration signals. SFRFs are designed as antagonistic spectral filters centered on characteristic fault frequencies, with inhibitory surrounds that enable robust characterization of incipient faults under variable operating conditions. A multi-objective evolutionary optimization strategy based on NSGA-II algorithm is employed to tune the receptive field parameters by simultaneously minimizing RUL prediction error, maximizing feature monotonicity, and promoting smooth degradation trajectories. The method is demonstrated on the XJTU-SY bearing run-to-failure dataset, confirming its suitability for constructing condition indicators in health monitoring applications. Key contributions include: (i) the introduction of SFRFs, inspired by the biology of vision in the primate retina; (ii) an evolutionary optimization framework guided by condition monitoring and prognosis criteria; and (iii) experimental evidence supporting the detection of early-stage faults and their precursors. Furthermore, we confirm that our diagnosis-informed spectral representation achieves accurate RUL prediction using a bagging regressor. The results highlight the interpretability and principled design of SFRFs, bridging signal processing, biological sensing principles, and data-driven prognostics in rotating machinery.
Three-dimensional Deep Shape Optimization with a Limited Dataset
Generative models have attracted considerable attention for their ability to produce novel shapes. However, their application in mechanical design remains constrained due to the limited size and variability of available datasets. This study proposes a deep learning-based optimization framework specifically tailored for shape optimization with limited datasets, leveraging positional encoding and a Lipschitz regularization term to robustly learn geometric characteristics and maintain a meaningful latent space. Through extensive experiments, the proposed approach demonstrates robustness, generalizability and effectiveness in addressing typical limitations of conventional optimization frameworks. The validity of the methodology is confirmed through multi-objective shape optimization experiments conducted on diverse three-dimensional datasets, including wheels and cars, highlighting the model's versatility in producing practical and high-quality design outcomes even under data-constrained conditions.
Energy-Efficient Green AI Architectures for Circular Economies Through Multi-Layered Sustainable Resource Optimization Framework
In this research paper, we propose a new type of energy-efficient Green AI architecture to support circular economies and address the contemporary challenge of sustainable resource consumption in modern systems. We introduce a multi-layered framework and meta-architecture that integrates state-of-the-art machine learning algorithms, energy-conscious computational models, and optimization techniques to facilitate decision-making for resource reuse, waste reduction, and sustainable production. We tested the framework on real-world datasets from lithium-ion battery recycling and urban waste management systems, demonstrating its practical applicability. Notably, the key findings of this study indicate a 25 percent reduction in energy consumption during workflows compared to traditional methods and an 18 percent improvement in resource recovery efficiency. Quantitative optimization was based on mathematical models such as mixed-integer linear programming and lifecycle assessments. Moreover, AI algorithms improved classification accuracy on urban waste by 20 percent, while optimized logistics reduced transportation emissions by 30 percent. We present graphical analyses and visualizations of the developed framework, illustrating its impact on energy efficiency and sustainability as reflected in the simulation results. This paper combines the principles of Green AI with practical insights into how such architectural models contribute to circular economies, presenting a fully scalable and scientifically rooted solution aligned with applicable UN Sustainability Goals worldwide. These results open avenues for incorporating newly developed AI technologies into sustainable management strategies, potentially safeguarding local natural capital while advancing technological progress.
OSI Stack Redesign for Quantum Networks: Requirements, Technologies, Challenges, and Future Directions
Ahmed, Shakil, Saeed, Muhammad Kamran, Khokhar, Ashfaq
Quantum communication is poised to become a foundational element of next-generation networking, offering transformative capabilities in security, entanglement-based connectivity, and computational offloading. However, the classical OSI model, designed for deterministic and error-tolerant systems, cannot support quantum-specific phenomena such as coherence fragility, probabilistic entanglement, and the no-cloning theorem. This paper provides a comprehensive survey and proposes an architectural redesign of the OSI model for quantum networks in the context of 7G. We introduce a Quantum-Converged OSI stack by extending the classical model with Layer 0 (Quantum Substrate) and Layer 8 (Cognitive Intent), supporting entanglement, teleportation, and semantic orchestration via LLMs and QML. Each layer is redefined to incorporate quantum mechanisms such as enhanced MAC protocols, fidelity-aware routing, and twin-based applications. This survey consolidates over 150 research works from IEEE, ACM, MDPI, arXiv, and Web of Science (2018-2025), classifying them by OSI layer, enabling technologies such as QKD, QEC, PQC, and RIS, and use cases such as satellite QKD, UAV swarms, and quantum IoT. A taxonomy of cross-layer enablers, such as hybrid quantum-classical control, metadata-driven orchestration, and blockchain-integrated quantum trust, is provided, along with simulation tools including NetSquid, QuNetSim, and QuISP. We present several domain-specific applications, including quantum healthcare telemetry, entangled vehicular networks, and satellite mesh overlays. An evaluation framework is proposed based on entropy throughput, coherence latency, and entanglement fidelity. Key future directions include programmable quantum stacks, digital twins, and AI-defined QNet agents, laying the groundwork for a scalable, intelligent, and quantum-compliant OSI framework for 7G and beyond.
Latency Optimization for Wireless Federated Learning in Multihop Networks
Shaon, Shaba, Nguyen, Van-Dinh, Nguyen, Dinh C.
In this paper, we study a novel latency minimization problem in wireless federated learning (FL) across multi-hop networks. The system comprises multiple routes, each integrating leaf and relay nodes for FL model training. We explore a personalized learning and adaptive aggregation-aware FL (PAFL) framework that effectively addresses data heterogeneity across participating nodes by harmonizing individual and collective learning objectives. We formulate an optimization problem aimed at minimizing system latency through the joint optimization of leaf and relay nodes, as well as the relay routing indicator. We also incorporate an additional energy harvesting scheme for the relay nodes to help with their relay tasks. This formulation presents a computationally demanding challenge, and thus we develop a simple yet efficient algorithm based on block coordinate descent and successive convex approximation (SCA) techniques. Simulation results illustrate the efficacy of our proposed joint optimization approach for leaf and relay nodes with the relay routing indicator. We observe significant latency savings in the wireless multi-hop PAFL system, with reductions of up to 69.37% compared to schemes optimizing only one node type, a traditional greedy algorithm, and a scheme without the relay routing indicator.
Optimization-Free Diffusion Model -- A Perturbation Theory Approach
Khoo, Yuehaw, Oster, Mathias, Peng, Yifan
Diffusion models have emerged as a powerful framework in generative modeling, typically relying on optimizing neural networks to estimate the score function via forward SDE simulations. In this work, we propose an alternative method that is both optimization-free and forward SDE-free. By expanding the score function in a sparse set of eigenbasis of the backward Kolmogorov operator associated with the diffusion process, we reformulate score estimation as the solution to a linear system, avoiding iterative optimization and time-dependent sample generation. We analyze the approximation error using perturbation theory and demonstrate the effectiveness of our method on high-dimensional Boltzmann distributions and real-world datasets.