
Collaborating Authors

 Cory-Wright, Ryan


Improved Approximation Algorithms for Low-Rank Problems Using Semidefinite Optimization

arXiv.org Artificial Intelligence

Inspired by the impact of the Goemans-Williamson algorithm on combinatorial optimization, we construct an analogous relax-then-sample strategy for low-rank optimization problems. First, for orthogonally constrained quadratic optimization problems, we derive a semidefinite relaxation and a randomized rounding scheme, which obtains provably near-optimal solutions, mimicking the blueprint from Goemans and Williamson for the Max-Cut problem. We then extend our approach to generic low-rank optimization problems by developing new semidefinite relaxations that are both tighter and more broadly applicable than those in prior works. Although our original proposal introduces large semidefinite matrices as decision variables, we show that most of the blocks in these matrices can be safely omitted without altering the optimal value, hence improving the scalability of our approach. Using several examples (including matrix completion, basis pursuit, and reduced-rank regression), we show how to reduce the size of our relaxation even further. Finally, we numerically illustrate the effectiveness and scalability of our relaxation and our sampling scheme on orthogonally constrained quadratic optimization and matrix completion problems.
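As a reminder of the Goemans-Williamson blueprint that this abstract extends, below is a minimal relax-then-sample sketch for the classical Max-Cut setting (not the paper's low-rank extension). It assumes numpy and cvxpy with an SDP-capable solver such as SCS are installed; the weight matrix W and the number of rounding samples are illustrative.

```python
import numpy as np
import cvxpy as cp

def goemans_williamson_maxcut(W, n_samples=50, seed=0):
    """Relax-then-sample sketch for Max-Cut on a symmetric weight matrix W."""
    n = W.shape[0]
    # Semidefinite relaxation: maximize (1/4) * sum_ij W_ij (1 - X_ij)
    # over correlation matrices X (PSD with unit diagonal).
    X = cp.Variable((n, n), symmetric=True)
    objective = cp.Maximize(0.25 * cp.sum(cp.multiply(W, 1 - X)))
    cp.Problem(objective, [X >> 0, cp.diag(X) == 1]).solve()

    # Factor X = V V^T, clipping tiny negative eigenvalues from the solver.
    vals, vecs = np.linalg.eigh(X.value)
    V = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None)))

    # Randomized rounding: cut each vertex by the sign of its projection
    # onto a random hyperplane; keep the best of several samples.
    rng = np.random.default_rng(seed)
    best_cut, best_val = None, -np.inf
    for _ in range(n_samples):
        signs = np.sign(V @ rng.standard_normal(n))
        signs[signs == 0] = 1.0
        val = 0.25 * np.sum(W * (1 - np.outer(signs, signs)))
        if val > best_val:
            best_cut, best_val = signs, val
    return best_cut, best_val
```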


Sparse PCA With Multiple Components

arXiv.org Machine Learning

Sparse Principal Component Analysis (sPCA) is a cardinal technique for obtaining combinations of features, or principal components (PCs), that explain the variance of high-dimensional datasets in an interpretable manner. This involves solving a sparsity and orthogonality constrained convex maximization problem, which is extremely computationally challenging. Most existing works address sparse PCA via methods, such as iteratively computing one sparse PC and deflating the covariance matrix, that do not guarantee the orthogonality, let alone the optimality, of the resulting solution when we seek multiple mutually orthogonal PCs. We challenge this status quo by reformulating the orthogonality conditions as rank constraints and optimizing over the sparsity and rank constraints simultaneously. We design tight semidefinite relaxations to supply high-quality upper bounds, which we strengthen via additional second-order cone inequalities when each PC's individual sparsity is specified. Further, we derive a combinatorial upper bound on the maximum amount of variance explained as a function of the support. We exploit these relaxations and bounds to propose exact methods and rounding mechanisms that, together, obtain solutions with a bound gap on the order of 0%-15% for real-world datasets with p = 100s or 1000s of features and $r \in \{2, 3\}$ components. Numerically, our algorithms match (and sometimes surpass) the best-performing methods in terms of fraction of variance explained and systematically return PCs that are sparse and orthogonal. In contrast, we find that existing methods like deflation return solutions that violate the orthogonality constraints, even when the data is generated according to sparse orthogonal PCs. Altogether, our approach solves sparse PCA problems with multiple components to certifiable (near) optimality in a practically tractable fashion.
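To make the deflation issue concrete, here is a toy sketch (assuming numpy is available; this is an illustrative stand-in, not the paper's algorithm) that computes sparse PCs one at a time with a thresholded power iteration and Hotelling deflation, after which the orthogonality of the stacked PCs can be checked directly.

```python
import numpy as np

def thresholded_power_iteration(Sigma, k, n_iter=200, seed=0):
    """Toy sparse-PC subroutine: power iteration with hard thresholding to k nonzeros."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(Sigma.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(n_iter):
        x = Sigma @ x
        support = np.argsort(np.abs(x))[-k:]      # keep the k largest-magnitude entries
        sparse_x = np.zeros_like(x)
        sparse_x[support] = x[support]
        x = sparse_x / np.linalg.norm(sparse_x)
    return x

def sparse_pcs_by_deflation(Sigma, r, k):
    """Compute r sparse PCs one at a time, deflating the covariance in between."""
    pcs = []
    for _ in range(r):
        x = thresholded_power_iteration(Sigma, k)
        pcs.append(x)
        Sigma = Sigma - (x @ Sigma @ x) * np.outer(x, x)  # Hotelling deflation
    return np.column_stack(pcs)

# U = sparse_pcs_by_deflation(Sigma, r=3, k=10)
# np.abs(U.T @ U - np.eye(3)).max()  # need not be ~0: sparse PCs can lose orthogonality
```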


AI Hilbert: A New Paradigm for Scientific Discovery by Unifying Data and Background Knowledge

arXiv.org Artificial Intelligence

The discovery of scientific formulae that parsimoniously explain natural phenomena and align with existing background theory is a key goal in science. Historically, scientists have derived natural laws by manipulating equations based on existing knowledge, forming new equations, and verifying them experimentally. In recent years, data-driven scientific discovery has emerged as a viable competitor in settings with large amounts of experimental data. Unfortunately, data-driven methods often fail to discover valid laws when data is noisy or scarce. Accordingly, recent works combine regression and reasoning to eliminate formulae inconsistent with background theory. However, the problem of searching over the space of formulae consistent with background theory to find one that fits the data best is not well-solved. We propose a solution to this problem when all axioms and scientific laws are expressible via polynomial equalities and inequalities and argue that our approach is widely applicable. We further model notions of minimal complexity using binary variables and logical constraints, solve polynomial optimization problems via mixed-integer linear or semidefinite optimization, and prove the validity of our scientific discoveries in a principled manner using Positivstellensatz certificates. Remarkably, the optimization techniques leveraged in this paper allow our approach to run in polynomial time with fully correct background theory, or non-deterministic polynomial (NP) time with partially correct background theory. We demonstrate that some famous scientific laws, including Kepler's Third Law of Planetary Motion, the Hagen-Poiseuille Equation, and the Radiated Gravitational Wave Power equation, can be derived in a principled manner from background axioms and experimental data.
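As a toy illustration of the equality-only case (assuming sympy is available; the axioms and candidate law below are illustrative and not taken from the paper), ideal membership can certify that a candidate formula is a polynomial combination of background axioms. The paper's full machinery additionally handles inequalities via Positivstellensatz certificates.

```python
import sympy as sp

F, m, a, v, r = sp.symbols('F m a v r')

axioms = [F - m * a,        # Newton's second law in this setting
          a * r - v ** 2]   # centripetal acceleration for circular motion

candidate = F * r - m * v ** 2   # candidate law to certify

# Reduce the candidate modulo a Groebner basis of the axiom ideal:
# a zero remainder certifies the candidate is a polynomial combination
# of the axioms (here, r*(F - m*a) + m*(a*r - v**2)).
G = sp.groebner(axioms, F, m, a, v, r, order='lex')
quotients, remainder = G.reduce(candidate)
print(remainder == 0)   # True
```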


Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression

arXiv.org Artificial Intelligence

Ridge-regularized sparse regression involves selecting a subset of features that explains the relationship between a design matrix and an output vector in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like leave-one-out cross-validation are commonly used for hyperparameter tuning. However, cross-validation typically increases the cost of sparse regression by several orders of magnitude. Additionally, validation metrics are noisy estimators of the test-set error, with different hyperparameter combinations giving models with different amounts of noise. Therefore, optimizing over these metrics is vulnerable to out-of-sample disappointment, especially in underdetermined settings. To address this, we make two contributions. First, we leverage the generalization theory literature to propose confidence-adjusted variants of leave-one-out that display less propensity to out-of-sample disappointment. Second, we leverage ideas from the mixed-integer optimization literature to obtain computationally tractable relaxations of confidence-adjusted leave-one-out, thereby minimizing it without solving as many mixed-integer optimization problems (MIOs). Our relaxations give rise to an efficient coordinate descent scheme that allows us to obtain significantly lower leave-one-out errors than other methods in the literature. We validate our theory by demonstrating that we obtain significantly sparser and comparably accurate solutions compared with popular methods like GLMNet, while suffering less out-of-sample disappointment. On synthetic datasets, our confidence adjustment procedure generates significantly fewer false discoveries and improves out-of-sample performance by 2-5% compared to cross-validating without confidence adjustment. Across a suite of 13 real datasets, a calibrated version of our procedure improves the test-set error by an average of 4% compared to cross-validating without confidence adjustment.
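For context on the metric being tuned, the sketch below (assuming numpy; it shows only plain leave-one-out for ridge regression, not the paper's confidence adjustment or coordinate-descent relaxation) uses the standard closed-form shortcut that avoids refitting the model n times.

```python
import numpy as np

def ridge_loo_mse(X, y, lam):
    """Closed-form leave-one-out MSE for ridge regression.

    Uses the hat-matrix shortcut e_i = (y_i - yhat_i) / (1 - H_ii),
    where H = X (X^T X + lam * I)^{-1} X^T.
    """
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    residuals = y - H @ y
    loo_residuals = residuals / (1.0 - np.diag(H))
    return np.mean(loo_residuals ** 2)

# Tuning lam by minimizing this noisy estimate is what invites out-of-sample
# disappointment; the paper's confidence adjustment penalizes the selection step.
```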


Optimal Low-Rank Matrix Completion: Semidefinite Relaxations and Eigenvector Disjunctions

arXiv.org Artificial Intelligence

Low-rank matrix completion consists of computing a matrix of minimal complexity that recovers a given set of observations as accurately as possible, and has numerous applications such as product recommendation. Unfortunately, existing methods for solving low-rank matrix completion are heuristics that, while highly scalable and often identifying high-quality solutions, do not possess any optimality guarantees. We reexamine matrix completion with an optimality-oriented eye, by reformulating low-rank problems as convex problems over the non-convex set of projection matrices and implementing a disjunctive branch-and-bound scheme that solves them to certifiable optimality. Further, we derive a novel and often tight class of convex relaxations by decomposing a low-rank matrix as a sum of rank-one matrices and incentivizing, via a Shor relaxation, that each two-by-two minor in each rank-one matrix has determinant zero. In numerical experiments, our new convex relaxations decrease the optimality gap by two orders of magnitude compared to existing attempts. Moreover, we showcase the performance of our disjunctive branch-and-bound scheme and demonstrate that it solves matrix completion problems over 150x150 matrices to certifiable optimality in hours, constituting an order of magnitude improvement on the state-of-the-art for certifiably optimal methods.
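The determinant condition targeted by the relaxation rests on an elementary fact: every two-by-two minor of a rank-one matrix vanishes. A quick numerical check (assuming numpy; the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
u, v = rng.standard_normal(5), rng.standard_normal(4)
X = np.outer(u, v)   # rank-one matrix

# Every 2x2 minor X[i,j]*X[k,l] - X[i,l]*X[k,j] equals
# u_i v_j u_k v_l - u_i v_l u_k v_j = 0.
minors = [X[i, j] * X[k, l] - X[i, l] * X[k, j]
          for i in range(5) for k in range(i + 1, 5)
          for j in range(4) for l in range(j + 1, 4)]
print(np.max(np.abs(minors)))   # numerically zero
```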


Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach

arXiv.org Machine Learning

We study the Sparse Plus Low Rank decomposition problem (SLR), which is the problem of decomposing a corrupted data matrix $\mathbf{D}$ into a sparse matrix $\mathbf{Y}$ containing the perturbations plus a low-rank matrix $\mathbf{X}$. SLR is a fundamental problem in Operations Research and Machine Learning that arises in many applications such as data compression, latent semantic indexing, collaborative filtering, and medical imaging. We introduce a novel formulation for SLR that directly models the underlying discreteness of the problem. For this formulation, we develop an alternating minimization heuristic that computes high-quality solutions and a novel semidefinite relaxation that provides meaningful bounds for the solutions returned by our heuristic. We further develop a custom branch-and-bound routine that leverages our heuristic and convex relaxation to solve small instances of SLR to certifiable near-optimality. Our heuristic can scale to $n=10000$ in hours, our relaxation can scale to $n=200$ in hours, and our branch-and-bound algorithm can scale to $n=25$ in minutes. Our numerical results demonstrate that our approach outperforms existing state-of-the-art approaches in terms of the MSE of the low-rank matrix and that of the sparse matrix.
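To illustrate the flavor of alternating minimization for this decomposition (a generic sketch assuming numpy, with illustrative rank and sparsity budgets; not the authors' exact heuristic), one can alternate a truncated SVD for the low-rank part with hard thresholding for the sparse part:

```python
import numpy as np

def sparse_plus_low_rank(D, rank, n_sparse, n_iter=50):
    """Alternating sketch for D ≈ X (low rank) + Y (sparse)."""
    Y = np.zeros_like(D)
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` approximation of D - Y via truncated SVD.
        U, s, Vt = np.linalg.svd(D - Y, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
        # Sparse step: keep the n_sparse largest-magnitude entries of the residual.
        R = D - X
        Y = np.zeros_like(D)
        idx = np.unravel_index(np.argsort(-np.abs(R), axis=None)[:n_sparse], D.shape)
        Y[idx] = R[idx]
    return X, Y
```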


A new perspective on low-rank optimization

arXiv.org Machine Learning

A key question in many low-rank problems throughout optimization, machine learning, and statistics is to characterize the convex hulls of simple low-rank sets and judiciously apply these convex hulls to obtain strong yet computationally tractable convex relaxations. We invoke the matrix perspective function (the matrix analog of the perspective function) and characterize explicitly the convex hull of epigraphs of convex quadratic, matrix exponential, and matrix power functions under low-rank constraints. Further, we exploit these characterizations to develop strong relaxations for a variety of low-rank problems including reduced-rank regression, non-negative matrix factorization, and factor analysis. We establish that these relaxations can be modeled via semidefinite and matrix power cone constraints, and thus optimized over tractably. The proposed approach parallels and generalizes the perspective reformulation technique in mixed-integer optimization, and leads to new relaxations for a broad class of problems.
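For reference, the standard definitions behind this construction (textbook facts, not specific to the paper) are as follows: the perspective of a convex function $f$ is $g_f(x, t) = t\, f(x/t)$ for $t > 0$, and its matrix analog, for a matrix convex $f$ and $Y \succ 0$, is $g_f(X, Y) = Y^{1/2} f(Y^{-1/2} X Y^{-1/2}) Y^{1/2}$. For example, for the matrix square $f(X) = X^2$ with symmetric $X$, the matrix perspective is $X Y^{-1} X$, and its epigraph $\{(X, Y, \theta) : X Y^{-1} X \preceq \theta\}$ is exactly the Schur complement constraint $\begin{pmatrix} Y & X \\ X & \theta \end{pmatrix} \succeq 0$, which is why such relaxations remain semidefinite-representable and tractable.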


Mixed-Projection Conic Optimization: A New Paradigm for Modeling Rank Constraints

arXiv.org Machine Learning

We propose a framework for modeling and solving low-rank optimization problems to certifiable optimality. We introduce symmetric projection matrices that satisfy $Y^2=Y$, the matrix analog of binary variables that satisfy $z^2=z$, to model rank constraints. By leveraging regularization and strong duality, we prove that this modeling paradigm yields tractable convex optimization problems over the non-convex set of orthogonal projection matrices. Furthermore, we design outer-approximation algorithms to solve low-rank problems to certifiable optimality, compute lower bounds via their semidefinite relaxations, and provide near-optimal solutions through rounding and local search techniques. We implement these numerical ingredients and, for the first time, solve low-rank optimization problems to certifiable optimality. By deriving an alternative to the popular nuclear norm relaxation, which generalizes the perspective relaxation from vectors to matrices, our algorithms also supply certifiably near-optimal solutions for larger problem sizes and outperform existing heuristics. All in all, our framework, which we name Mixed-Projection Conic Optimization, solves low-rank problems to certifiable optimality in a tractable and unified fashion.
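The rank-constraint encoding described here rests on a simple linear-algebra fact: rank(X) ≤ k exactly when Y X = X for some orthogonal projection matrix Y with trace at most k. A quick numerical check (assuming numpy; sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# A rank-k matrix X and the orthogonal projection Y onto its column space.
X = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))
U, _, _ = np.linalg.svd(X)
Y = U[:, :k] @ U[:, :k].T

print(np.allclose(Y @ Y, Y))        # idempotent: Y^2 = Y (matrix analog of z^2 = z)
print(np.isclose(np.trace(Y), k))   # trace(Y) = rank(Y) = k
print(np.allclose(Y @ X, X))        # Y X = X certifies rank(X) <= k
```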


On Polyhedral and Second-Order-Cone Decompositions of Semidefinite Optimization Problems

arXiv.org Machine Learning

However, it is notoriously difficult to solve in practice, because IPMs' memory requirements scale at a demanding rate. Indeed, state-of-the-art SDO solvers such as MOSEK cannot solve constrained instances of Problem (1) with n ≥ 250 variables on a standard laptop, and it is optimization folklore that there is a gap between SDO's theoretical and practical tractability. Motivated by the demanding memory requirements of IPMs, a stream of literature studies inexact methods for SDOs, which replace the semidefinite constraint with weaker yet less computationally demanding constraints. This approach was first investigated by Kim and Kojima [13], who observed that relaxing a positive semidefinite constraint to the weaker constraint that all 2 × 2 minors of a matrix are positive semidefinite yields a second-order cone (SOC)-representable outer approximation of the positive semidefinite (PSD) cone. In a related line of work, Krishnan and Mitchell [15] propose applying Kelley [12]'s cutting plane method to generate
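To make the Kim-Kojima style relaxation concrete, here is a hedged sketch (assuming cvxpy is available; the helper name is ours) of the SOC-representable outer approximation obtained by requiring only the 2 × 2 principal minors of a symmetric matrix variable to be positive semidefinite:

```python
import cvxpy as cp

def psd_2x2_outer_approximation(X):
    """SOC-representable outer approximation of the PSD cone.

    Requires every 2x2 principal minor of the symmetric variable X to be PSD:
    X_ii >= 0 and X_ii * X_jj >= X_ij^2, written in second-order cone form.
    """
    n = X.shape[0]
    constraints = [cp.diag(X) >= 0]
    for i in range(n):
        for j in range(i + 1, n):
            constraints.append(
                cp.norm(cp.hstack([2 * X[i, j], X[i, i] - X[j, j]])) <= X[i, i] + X[j, j]
            )
    return constraints

# Replacing [X >> 0] with psd_2x2_outer_approximation(X) trades tightness for
# a much cheaper second-order cone problem, as the excerpt describes.
```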


A unified approach to mixed-integer optimization: Nonlinear formulations and scalable algorithms

arXiv.org Machine Learning

We propose a unified framework to address a family of classical mixed-integer optimization problems, including network design, facility location, unit commitment, sparse portfolio selection, binary quadratic optimization and sparse learning problems. These problems exhibit logical relationships between continuous and discrete variables, which are usually reformulated linearly using a big-M formulation. In this work, we challenge this longstanding modeling practice and express the logical constraints in a non-linear way. By imposing a regularization condition, we reformulate these problems as convex binary optimization problems, which are solvable using an outer-approximation procedure. In numerical experiments, we establish that a general-purpose numerical strategy, which combines cutting-plane, first-order and local search methods, solves these problems faster and at a larger scale than state-of-the-art mixed-integer linear or second-order cone methods. Our approach successfully solves network design problems with 100s of nodes and provides solutions up to 40% better than the state-of-the-art; sparse portfolio selection problems with up to 3,200 securities compared with 400 securities for previous attempts; and sparse regression problems with up to 100,000 covariates.
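To illustrate the modeling change the abstract advocates (a generic sparse-regression-style sketch assuming cvxpy; the variable names, the auxiliary variable s, and the ridge interpretation are ours, and a mixed-integer conic solver is required once z is declared Boolean), compare the big-M linking constraints with a nonlinear, perspective-based counterpart:

```python
import cvxpy as cp

def big_m_linking(x, z, M):
    """Classical linear linking: x_i is forced to 0 whenever z_i = 0."""
    return [x <= M * z, x >= -M * z]

def perspective_linking(x, z, s):
    """Nonlinear linking via the perspective of x_i^2: x_i^2 <= s_i * z_i.

    Written as rotated second-order cones; the auxiliary variable s_i is
    charged in the objective (e.g. + s_i / (2 * gamma) under a ridge penalty),
    which tightens the continuous relaxation relative to big-M.
    """
    n = x.shape[0]
    cons = [s >= 0]
    for i in range(n):
        # ||(2 x_i, s_i - z_i)|| <= s_i + z_i  <=>  x_i^2 <= s_i * z_i, s_i + z_i >= 0
        cons.append(cp.norm(cp.hstack([2 * x[i], s[i] - z[i]])) <= s[i] + z[i])
    return cons
```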