Goto

Collaborating Authors

 cone program




Differentiable Convex Optimization Layers in Neural Architectures: Foundations and Perspectives

arXiv.org Artificial Intelligence

The integration of optimization problems within neural network architectures represents a fundamental shift from traditional approaches to handling constraints in deep learning. While it is long known that neural networks can incorporate soft constraints with techniques such as regularization, strict adherence to hard constraints is generally more difficult. A recent advance in this field, however, has addressed this problem by enabling the direct embedding of optimization layers as differentiable components within deep networks. This paper surveys the evolution and current state of this approach, from early implementations limited to quadratic programming, to more recent frameworks supporting general convex optimization problems. We provide a comprehensive review of the background, theoretical foundations, and emerging applications of this technology. Our analysis includes detailed mathematical proofs and an examination of various use cases that demonstrate the potential of this hybrid approach. This work synthesizes developments at the intersection of optimization theory and deep learning, offering insights into both current capabilities and future research directions in this rapidly evolving field.


Gradient boosting for convex cone predict and optimize problems

arXiv.org Artificial Intelligence

Recently there has been a growing body of research on decision-aware predictive modelling (see for example [5, 4, 15, 16, 18, 21, 25]). A traditional'predict, then optimize' framework treats the prediction estimation and decision optimization problem independently. As such, an'objective mismatch' [20] can occur whereby improved prediction accuracy does not result in improved decision accuracy. Conversely, the smart'predict, then optimize' (SPO) [15] framework optimizes prediction models in order to minimize the final downstream decision regret. To date, the SPO framework has been studied in a general setting for linear and decision tree regression models [15, 16]. In this paper we present dboost, a general purpose framework that combines the strength of gradient boosting with the SPO framework. Previous work [19] considers gradient boosting for integrated prediction and optimization problems but only considers a small subset of optimization problems with linear inequality constraints.


Bayesian Inference of Random Dot Product Graphs via Conic Programming

arXiv.org Machine Learning

We present a convex cone program to infer the latent probability matrix of a random dot product graph (RDPG). The optimization problem maximizes the Bernoulli maximum likelihood function with an added nuclear norm regularization term. The dual problem has a particularly nice form, related to the well-known semidefinite program relaxation of the maxcut problem. Using the primal-dual optimality conditions, we bound the entries and rank of the primal and dual solutions. Furthermore, we bound the optimal objective value and prove asymptotic consistency of the probability estimates of a slightly modified model under mild technical assumptions. Our experiments on synthetic RDPGs not only recover natural clusters, but also reveal the underlying low-dimensional geometry of the original data. We also demonstrate that the method recovers latent structure in the Karate Club Graph and synthetic U.S. Senate vote graphs and is scalable to graphs with up to a few hundred nodes.


Differentiable Convex Optimization Layers

arXiv.org Machine Learning

Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. We introduce disciplined parametrized programming, a subset of disciplined convex programming, and we show that every disciplined parametrized program can be represented as the composition of an affine map from parameters to problem data, a solver, and an affine map from the solver's solution to a solution of the original problem (a new form we refer to as affine-solver-affine form). We then demonstrate how to efficiently differentiate through each of these components, allowing for end-to-end analytical differentiation through the entire convex program. We implement our methodology in version 1.1 of CVXPY, a popular Python-embedded DSL for convex optimization, and additionally implement differentiable layers for disciplined convex programs in PyTorch and TensorFlow 2.0. Our implementation significantly lowers the barrier to using convex optimization problems in differentiable programs. We present applications in linear machine learning models and in stochastic control, and we show that our layer is competitive (in execution time) compared to specialized differentiable solvers from past work.