altmin

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

SUMMARY: This paper studies the effect of noise correlation in several models of multi-output regression. It argues that a method that does not exploit the correlation, such as Ordinary Least Squares (OLS), may perform much worse than one that does, such as Maximum Likelihood Estimation (MLE). For the linear models studied in the paper (the pooled model and seemingly unrelated regression), the MLE requires jointly optimizing the noise covariance and the regression weights, which is a non-convex problem. The Alternating Minimization (AltMin) algorithm approaches this problem by iteratively optimizing the covariance and the weights.
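
For concreteness, the following is a minimal sketch of the AltMin alternation described above for the pooled model: a generalized-least-squares step for the weights given the current covariance, followed by a residual-covariance step given the weights. The per-sample data layout and all names are illustrative assumptions, not the paper's code.

import numpy as np

def altmin_pooled_mle(Xs, ys, n_iter=50):
    """AltMin for the joint MLE in a pooled multi-response model
    y_i = X_i @ theta + eta_i with eta_i ~ N(0, Sigma).
    Xs: (n, m, p) per-sample design matrices; ys: (n, m) responses.
    Minimal sketch under these assumptions (n >> m so the residual
    covariance stays invertible); not the paper's exact procedure.
    """
    n, m, p = Xs.shape
    Sigma = np.eye(m)  # identity start: the first theta-step is OLS
    for _ in range(n_iter):
        # theta-step: generalized least squares given the current Sigma.
        S_inv = np.linalg.inv(Sigma)
        A = np.einsum('imp,mk,ikq->pq', Xs, S_inv, Xs)
        b = np.einsum('imp,mk,ik->p', Xs, S_inv, ys)
        theta = np.linalg.solve(A, b)
        # Sigma-step: empirical covariance of the residuals.
        R = ys - np.einsum('imp,p->im', Xs, theta)
        Sigma = R.T @ R / n
    return theta, Sigma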


Alternating minimization for square root principal component pursuit

Deng, Shengxiang, Li, Xudong, Zhang, Yangjing

arXiv.org Machine Learning

Recently, the square root principal component pursuit (SRPCP) model has garnered significant research interest. It is shown in the literature that the SRPCP model guarantees robust matrix recovery with a universal, constant penalty parameter. While its statistical advantages are well-documented, the computational aspects from an optimization perspective remain largely unexplored. In this paper, we focus on developing efficient optimization algorithms for solving the SRPCP problem. Specifically, we propose a tuning-free alternating minimization (AltMin) algorithm, where each iteration involves subproblems enjoying closed-form optimal solutions. Additionally, we introduce techniques based on the variational formulation of the nuclear norm and Burer-Monteiro decomposition to further accelerate the AltMin method. Extensive numerical experiments confirm the efficiency and robustness of our algorithms.
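
As a rough illustration of alternating minimization with closed-form subproblems, the sketch below applies AltMin to the classical squared-loss PCP surrogate, where the low-rank step is singular value thresholding and the sparse step is entrywise soft-thresholding. SRPCP itself penalizes the unsquared Frobenius residual, so this is a stand-in for the structure of the paper's updates, not the SRPCP solver.

import numpy as np

def svt(A, tau):
    """Singular value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(A, tau):
    """Entrywise soft-thresholding: prox of tau * l1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def altmin_pcp(M, lam=None, mu=1.0, n_iter=100):
    """AltMin for min_{L,S} ||L||_* + lam*||S||_1 + (mu/2)*||M-L-S||_F^2.
    Each subproblem is a prox with a closed-form solution; SRPCP
    replaces the squared loss with its square root (sketch only).
    """
    lam = lam if lam is not None else 1.0 / np.sqrt(max(M.shape))
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = svt(M - S, 1.0 / mu)    # L-step: SVT of the residual
        S = soft(M - L, lam / mu)   # S-step: soft-threshold the residual
    return L, S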


Alternating Estimation for Structured High-Dimensional Multi-Response Models

Chen, Sheng, Banerjee, Arindam

Neural Information Processing Systems

We consider the problem of learning high-dimensional multi-response linear models with structured parameters. By exploiting the noise correlations among different responses, we propose an alternating estimation (AltEst) procedure to estimate the model parameters based on the generalized Dantzig selector (GDS). Under suitable sample size and resampling assumptions, we show that the error of the estimates generated by AltEst converges, with high probability, linearly to a certain minimum achievable level, which can be tersely expressed by a few geometric measures, such as the Gaussian width of sets related to the parameter structure. To the best of our knowledge, this is the first non-asymptotic statistical guarantee for such an AltEst-type algorithm applied to estimation with general structures.
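
A hypothetical sketch of such an AltEst loop is given below: each iteration whitens the data with the current noise-covariance estimate, solves a generalized Dantzig selector step (written here with cvxpy), and re-estimates the residual covariance. The fresh-sample resampling assumed by the analysis is omitted for brevity; the data layout and names are assumptions, not the authors' code.

import numpy as np
import cvxpy as cp

def gds(X, y, lam):
    """Generalized Dantzig selector:
    min ||theta||_1  s.t.  ||X.T @ (y - X @ theta)||_inf <= lam."""
    theta = cp.Variable(X.shape[1])
    cons = [cp.norm_inf(X.T @ (y - X @ theta)) <= lam]
    cp.Problem(cp.Minimize(cp.norm1(theta)), cons).solve()
    return theta.value

def altest(Xs, ys, lam, n_iter=10):
    """AltEst-style sketch: alternate a GDS parameter step with a
    residual-covariance step. Illustrative only, not the paper's code.
    Xs: (n, m, p) per-sample designs; ys: (n, m) responses.
    """
    n, m, p = Xs.shape
    Sigma = np.eye(m)
    for _ in range(n_iter):
        # Whiten each sample with Sigma^{-1/2} so the GDS step sees
        # (approximately) decorrelated noise, then stack and solve.
        w, V = np.linalg.eigh(Sigma)
        W = V @ np.diag(np.maximum(w, 1e-12) ** -0.5) @ V.T
        Xw = (W @ Xs).reshape(n * m, p)          # batched matmul
        yw = (W @ ys[..., None]).reshape(n * m)
        theta = gds(Xw, yw, lam)
        # Covariance step: empirical residual covariance.
        R = ys - np.einsum('imp,p->im', Xs, theta)
        Sigma = R.T @ R / n
    return theta, Sigma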


Efficient Federated Low Rank Matrix Completion

Abbasi, Ahmed Ali, Vaswani, Namrata

arXiv.org Artificial Intelligence

In this work, we develop and analyze a Gradient Descent (GD) based solution, called Alternating GD and Minimization (AltGDmin), for efficiently solving the low rank matrix completion (LRMC) problem in a federated setting. LRMC involves recovering an $n \times q$ rank-$r$ matrix $X^\star$ from a subset of its entries when $r \ll \min(n,q)$. Our theoretical guarantees (iteration and sample complexity bounds) imply that AltGDmin is the most communication-efficient solution in a federated setting, is one of the fastest, and has the second-best sample complexity among all iterative solutions to LRMC. In addition, we prove two important corollaries. (a) We provide a guarantee for AltGDmin for solving the noisy LRMC problem. (b) We show how our lemmas can be used to provide an improved sample complexity guarantee for AltMin, which is the fastest centralized solution.
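
The sketch below illustrates the AltGDmin idea on centralized data: factor the unknown matrix as U @ B, solve for B column-wise in closed form (the minimization step), and take one projected gradient step on U (the GD step). The initialization, step size, and stopping rule here are illustrative choices; in the federated setting, each node would update its own columns of B locally and only the gradient with respect to U would be aggregated.

import numpy as np

def altgdmin_lrmc(Y, mask, r, eta=1.0, n_iter=100):
    """AltGDmin sketch for low-rank matrix completion.
    Y: (n, q) matrix, zeros at unobserved entries; mask: (n, q) bool.
    Minimal sketch of the alternation only, not the paper's algorithm
    with its spectral initialization and federated communication.
    """
    n, q = Y.shape
    # Crude initialization: top-r left singular vectors of Y.
    U = np.linalg.svd(Y, full_matrices=False)[0][:, :r]
    for _ in range(n_iter):
        # Min step: per-column least squares on observed rows only.
        B = np.zeros((r, q))
        for k in range(q):
            rows = mask[:, k]
            B[:, k], *_ = np.linalg.lstsq(U[rows], Y[rows, k], rcond=None)
        # GD step on U over observed entries, then re-orthonormalize.
        G = (mask * (U @ B - Y)) @ B.T
        U, _ = np.linalg.qr(U - (eta / np.count_nonzero(mask)) * G)
    return U, B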


An Improved Analysis of Alternating Minimization for Structured Multi-Response Regression

Chen, Sheng, Banerjee, Arindam

Neural Information Processing Systems

Multi-response linear models aggregate a set of vanilla linear models by assuming correlated noise across them, with an unknown covariance structure. To find the coefficient vector, estimators that jointly approximate the noise covariance are often preferred over simple linear regression in view of their superior empirical performance; such joint estimators are generally solved by alternating-minimization-type procedures. Due to the non-convex nature of these joint estimators, the theoretical justification of their efficiency is typically challenging. Existing analyses fail to fully explain the empirical observations because they assume resampling in the alternating procedure, which requires access to fresh samples in each iteration. In this work, we present a resampling-free analysis of the alternating minimization algorithm applied to multi-response regression. In particular, we focus on the high-dimensional setting of multi-response linear models with a structured coefficient parameter, and show that the statistical error of the parameter can be expressed through the Gaussian width, a complexity measure related to the assumed structure. More importantly, to the best of our knowledge, our result reveals for the first time that alternating minimization with random initialization can achieve the same performance as the well-initialized one when solving this multi-response regression problem. Experimental results support our theoretical developments.
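
To make the analyzed setting concrete, here is an illustrative resampling-free AltMin loop: the full sample is reused in every iteration and the coefficient vector is initialized at random, as in the paper's analysis. The l1 penalty stands in for a generic structure-inducing norm; the solver, data layout, and names are assumptions, not the authors' implementation.

import numpy as np
import cvxpy as cp

def altmin_structured(Xs, ys, lam, n_iter=10, seed=0):
    """Resampling-free AltMin sketch for structured multi-response
    regression: y_i = X_i @ theta + eta_i, eta_i ~ N(0, Sigma),
    with a sparsity-inducing l1 penalty on theta. Illustrative only.
    Xs: (n, m, p) per-sample designs; ys: (n, m) responses.
    """
    n, m, p = Xs.shape
    theta = np.random.default_rng(seed).standard_normal(p)  # random init
    for _ in range(n_iter):
        # Sigma-step: residual covariance at the current theta.
        R = ys - np.einsum('imp,p->im', Xs, theta)
        Sigma = R.T @ R / n
        # theta-step: l1-regularized GLS; whiten with Sigma^{-1/2},
        # then solve the resulting lasso problem with cvxpy.
        w, V = np.linalg.eigh(Sigma)
        W = V @ np.diag(np.maximum(w, 1e-12) ** -0.5) @ V.T
        Xw = (W @ Xs).reshape(n * m, p)
        yw = (W @ ys[..., None]).reshape(n * m)
        t = cp.Variable(p)
        obj = cp.sum_squares(Xw @ t - yw) / (2 * n) + lam * cp.norm1(t)
        cp.Problem(cp.Minimize(obj)).solve()
        theta = t.value
    return theta, Sigma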

