differentiable optimization
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs
We propose a framework which makes it feasible to directly train deep neural networks with respect to popular families of task-specific non-decomposable performance measures such as AUC, multi-class AUC, $F$-measure and others. A common feature of the optimization model that emerges from these tasks is that it involves solving a Linear Programs (LP) during training where representations learned by upstream layers characterize the constraints or the feasible set. The constraint matrix is not only large but the constraints are also modified at each iteration. We show how adopting a set of ingenious ideas proposed by Mangasarian for 1-norm SVMs -- which advocates for solving LPs with a generalized Newton method -- provides a simple and effective solution that can be run on the GPU. In particular, this strategy needs little unrolling, which makes it more efficient during backward pass. Further, even when the constraint matrix is too large to fit on the GPU memory (say large minibatch settings), we show that running the Newton method in a lower dimensional space yields accurate gradients for training, by utilizing a statistical concept called {\em sufficient} dimension reduction. While a number of specialized algorithms have been proposed for the models that we describe here, our module turns out to be applicable without any specific adjustments or relaxations. We describe each use case, study its properties and demonstrate the efficacy of the approach over alternatives which use surrogate lower bounds and often, specialized optimization schemes. Frequently, we achieve superior computational behavior and performance improvements on common datasets used in the literature.
Differentiable Optimization for Deep Learning-Enhanced DC Approximation of AC Optimal Power Flow
Rosemberg, Andrew, Klamkin, Michael, Van Hentenryck, Pascal
The growing scale of power systems and the increasing uncertainty introduced by renewable energy sources necessitates novel optimization techniques that are significantly faster and more accurate than existing methods. The AC Optimal Power Flow (AC-OPF) problem, a core component of power grid optimization, is often approximated using linearized DC Optimal Power Flow (DC-OPF) models for computational tractability, albeit at the cost of suboptimal and inefficient decisions. To address these limitations, we propose a novel deep learning-based framework for network equivalency that enhances DC-OPF to more closely mimic the behavior of AC-OPF. The approach utilizes recent advances in differentiable optimization, incorporating a neural network trained to predict adjusted nodal shunt conductances and branch susceptances in order to account for nonlinear power flow behavior. The model can be trained end-to-end using modern deep learning frameworks by leveraging the implicit function theorem. Results demonstrate the framework's ability to significantly improve prediction accuracy.
- North America > United States > Georgia > Fulton County > Atlanta (0.05)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
- Energy > Power Industry (1.00)
- Government > Regional Government > North America Government > United States Government (0.47)
A General and Streamlined Differentiable Optimization Framework
Rosemberg, Andrew W., Garcia, Joaquim Dias, Pacaud, François, Parker, Robert B., Legat, Benoît, Sundar, Kaarthik, Bent, Russell, Van Hentenryck, Pascal
Differentiating through constrained optimization problems is increasingly central to learning, control, and large-scale decision-making systems, yet practical integration remains challenging due to solver specialization and interface mismatches. This paper presents a general and streamlined framework-an updated DiffOpt.jl-that unifies modeling and differentiation within the Julia optimization stack. The framework computes forward - and reverse-mode solution and objective sensitivities for smooth, potentially nonconvex programs by differentiating the KKT system under standard regularity assumptions. A first-class, JuMP-native parameter-centric API allows users to declare named parameters and obtain derivatives directly with respect to them - even when a parameter appears in multiple constraints and objectives - eliminating brittle bookkeeping from coefficient-level interfaces. We illustrate these capabilities on convex and nonconvex models, including economic dispatch, mean-variance portfolio selection with conic risk constraints, and nonlinear robot inverse kinematics. Two companion studies further demonstrate impact at scale: gradient-based iterative methods for strategic bidding in energy markets and Sobolev-style training of end-to-end optimization proxies using solver-accurate sensitivities. Together, these results demonstrate that differentiable optimization can be deployed as a routine tool for experimentation, learning, calibration, and design-without deviating from standard JuMP modeling practices and while retaining access to a broad ecosystem of solvers.
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
Parameter Identification of a Differentiable Human Arm Musculoskeletal Model without Deep Muscle EMG Reconstruction
Sanderink, Philip, Zhou, Yingfan, Luo, Shuzhen, Fang, Cheng
Accurate parameter identification of a subject-specific human musculoskeletal model is crucial to the development of safe and reliable physically collaborative robotic systems, for instance, assistive exoskeletons. Electromyography (EMG)-based parameter identification methods have demonstrated promising performance for personalized musculoskeletal modeling, whereas their applicability is limited by the difficulty of measuring deep muscle EMGs invasively. Although several strategies have been proposed to reconstruct deep muscle EMGs or activations for parameter identification, their reliability and robustness are limited by assumptions about the deep muscle behavior. In this work, we proposed an approach to simultaneously identify the bone and superficial muscle parameters of a human arm musculoskeletal model without reconstructing the deep muscle EMGs. This is achieved by only using the least-squares solution of the deep muscle forces to calculate a loss gradient with respect to the model parameters for identifying them in a framework of differentiable optimization. The results of extensive comparative simulations manifested that our proposed method can achieve comparable estimation accuracy compared to a similar method, but with all the muscle EMGs available.
Minimizing Surrogate Losses for Decision-Focused Learning using Differentiable Optimization
Mandi, Jayanta, Mahmutoğulları, Ali İrfan, Berden, Senne, Guns, Tias
Decision-focused learning (DFL) trains a machine learning (ML) model to predict parameters of an optimization problem, to directly minimize decision regret, i.e., maximize decision quality. Gradient-based DFL requires computing the derivative of the solution to the optimization problem with respect to the predicted parameters. However, for many optimization problems, such as linear programs (LPs), the gradient of the regret with respect to the predicted parameters is zero almost everywhere. Existing gradient-based DFL approaches for LPs try to circumvent this issue in one of two ways: (a) smoothing the LP into a differentiable optimization problem by adding a quadratic regularizer and then minimizing the regret directly or (b) minimizing surrogate losses that have informative (sub)gradients. In this paper, we show that the former approach still results in zero gradients, because even after smoothing the regret remains constant across large regions of the parameter space. To address this, we propose minimizing surrogate losses -- even when a differentiable optimization layer is used and regret can be minimized directly. Our experiments demonstrate that minimizing surrogate losses allows differentiable optimization layers to achieve regret comparable to or better than surrogate-loss based DFL methods. Further, we demonstrate that this also holds for DYS-Net, a recently proposed differentiable optimization technique for LPs, that computes approximate solutions and gradients through operations that can be performed using feedforward neural network layers. Because DYS-Net executes the forward and the backward pass very efficiently, by minimizing surrogate losses using DYS-Net, we are able to attain regret on par with the state-of-the-art while reducing training time by a significant margin.
- North America > United States (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Diff-MSM: Differentiable MusculoSkeletal Model for Simultaneous Identification of Human Muscle and Bone Parameters
Zhou, Yingfan, Sanderink, Philip, Lemming, Sigurd Jager, Fang, Cheng
High-fidelity personalized human musculoskeletal models are crucial for simulating realistic behavior of physically coupled human-robot interactive systems and verifying their safety-critical applications in simulations before actual deployment, such as human-robot co-transportation and rehabilitation through robotic exoskeletons. Identifying subject-specific Hill-type muscle model parameters and bone dynamic parameters is essential for a personalized musculoskeletal model, but very challenging due to the difficulty of measuring the internal biomechanical variables in vivo directly, especially the joint torques. In this paper, we propose using Differentiable MusculoSkeletal Model (Diff-MSM) to simultaneously identify its muscle and bone parameters with an end-to-end automatic differentiation technique differentiating from the measurable muscle activation, through the joint torque, to the resulting observable motion without the need to measure the internal joint torques. Through extensive comparative simulations, the results manifested that our proposed method significantly outperformed the state-of-the-art baseline methods, especially in terms of accurate estimation of the muscle parameters (i.e., initial guess sampled from a normal distribution with the mean being the ground truth and the standard deviation being 10% of the ground truth could end up with an average of the percentage errors of the estimated values as low as 0.05%). In addition to human musculoskeletal modeling and simulation, the new parameter identification technique with the Diff-MSM has great potential to enable new applications in muscle health monitoring, rehabilitation, and sports science.
- South America > Uruguay > Artigas > Artigas (0.04)
- North America > United States (0.04)
- Europe > Denmark > Southern Denmark (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
- Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.55)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
DOMAC: Differentiable Optimization for High-Speed Multipliers and Multiply-Accumulators
Xue, Chenhao, Ren, Yi, Zhou, Jinwei, Li, Kezhi, Zhang, Chen, Lin, Yibo, Zhang, Lining, Xu, Qiang, Sun, Guangyu
Multipliers and multiply-accumulators (MACs) are fundamental building blocks for compute-intensive applications such as artificial intelligence. With the diminishing returns of Moore's Law, optimizing multiplier performance now necessitates process-aware architectural innovations rather than relying solely on technology scaling. In this paper, we introduce DOMAC, a novel approach that employs differentiable optimization for designing multipliers and MACs at specific technology nodes. DOMAC establishes an analogy between optimizing multi-staged parallel compressor trees and training deep neural networks. Building on this insight, DOMAC reformulates the discrete optimization challenge into a continuous problem by incorporating differentiable timing and area objectives. This formulation enables us to utilize existing deep learning toolkit for highly efficient implementation of the differentiable solver. Experimental results demonstrate that DOMAC achieves significant enhancements in both performance and area efficiency compared to state-of-the-art baselines and commercial IPs in multiplier and MAC designs.
Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs
We propose a framework which makes it feasible to directly train deep neural networks with respect to popular families of task-specific non-decomposable performance measures such as AUC, multi-class AUC, F -measure and others. A common feature of the optimization model that emerges from these tasks is that it involves solving a Linear Programs (LP) during training where representations learned by upstream layers characterize the constraints or the feasible set. The constraint matrix is not only large but the constraints are also modified at each iteration. We show how adopting a set of ingenious ideas proposed by Mangasarian for 1-norm SVMs -- which advocates for solving LPs with a generalized Newton method -- provides a simple and effective solution that can be run on the GPU. In particular, this strategy needs little unrolling, which makes it more efficient during backward pass.
Learning Fair Ranking Policies via Differentiable Optimization of Ordered Weighted Averages
Dinh, My H., Kotary, James, Fioretto, Ferdinando
Learning to Rank (LTR) is one of the most widely used machine learning applications. It is a key component in platforms with profound societal impacts, including job search, healthcare information retrieval, and social media content feeds. Conventional LTR models have been shown to produce biases results, stimulating a discourse on how to address the disparities introduced by ranking systems that solely prioritize user relevance. However, while several models of fair learning to rank have been proposed, they suffer from deficiencies either in accuracy or efficiency, thus limiting their applicability to real-world ranking platforms. This paper shows how efficiently-solvable fair ranking models, based on the optimization of Ordered Weighted Average (OWA) functions, can be integrated into the training loop of an LTR model to achieve favorable balances between fairness, user utility, and runtime efficiency. In particular, this paper is the first to show how to backpropagate through constrained optimizations of OWA objectives, enabling their use in integrated prediction and decision models.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Umbria > Perugia Province > Perugia (0.04)