AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Model Selection for High-Dimensional Regression under the Generalized Irrepresentability Condition

Neural Information Processing SystemsMar-13-2024, 19:20:43 GMT

In the high-dimensional regression model a response variable is linearly related to p covariates, but the sample size n is smaller than p. We assume that only a small subset of covariates is'active' (i.e., the corresponding coefficients are non-zero), and consider the model-selection problem of identifying the active covariates.

estimator, generalized irrepresentability condition, irrepresentability condition, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback

9908279ebbf1f9b250ba689db6a0222b-Reviews.html

Neural Information Processing SystemsMar-13-2024, 18:52:12 GMT

The authors present a new method for robust principal component regression for non-Gaussian data. First, they show that principal component regression outperforms classical linear regression when the dimensionality and the sample size are allowed to increase by being insensitive to collinearity and exploiting low rank structure. They demonstrate their theoretical calculations by sweeping parameters and show that mean square error follows theory. Then the authors develop a new method for doing principal component regression by assuming the random vector and noise are elliptically distributed, a more general assumption than the standard Gaussian assumption. They demonstrate that this more general method outperforms traditional principal component regression on different elliptical distributions (multivariate-t, EC1, EC2), and show that it achieves similar performance for Gaussian distributions.

principal component regression, regression, sparsity pattern, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.61)

Add feedback

Robust Sparse Principal Component Regression under the High Dimensional Elliptical Model

Neural Information Processing SystemsMar-13-2024, 18:52:11 GMT

In this paper we focus on the principal component regression and its application to high dimension non-Gaussian data. The major contributions are two folds. First, in low dimensions and under the Gaussian model, by borrowing the strength from recent development in minimax optimal principal component estimation, we first time sharply characterize the potential advantage of classical principal component regression over least square estimation. Secondly, we propose and analyze a new robust sparse principal component regression on high dimensional elliptically distributed data. The elliptical distribution is a semiparametric generalization of the Gaussian, including many well known distributions such as multivariate Gaussian, rank-deficient Gaussian, t, Cauchy, and logistic. It allows the random vector to be heavy tailed and have tail dependence. These extra flexibilities make it very suitable for modeling finance and biomedical imaging data. Under the elliptical model, we prove that our method can estimate the regression coefficients in the optimal parametric rate and therefore is a good alternative to the Gaussian based methods. Experiments on synthetic and real world data are conducted to illustrate the empirical usefulness of the proposed method.

component regression, principal component regression, regression, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Maryland > Baltimore (0.04)
Asia > Middle East > Jordan (0.04)

Industry:

Health & Medicine (0.88)
Banking & Finance > Trading (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

Add feedback

Sketching Structured Matrices for Faster Nonlinear Regression

Neural Information Processing SystemsMar-13-2024, 18:51:06 GMT

These problems involve Vandermonde matrices which arise naturally in various statistical modeling settings, including classical polynomial fitting problems, additive models and approximations to recently developed randomized techniques for scalable kernel methods. We show that this structure can be exploited to further accelerate the solution of the regression problem, achieving running times that are faster than "input sparsity".

matrix, regression, regression problem, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)

Genre: Research Report (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Dirty Statistical Models

Neural Information Processing SystemsMar-13-2024, 18:22:20 GMT

We provide a unified framework for the high-dimensional analysis of "superposition-structured" or "dirty" statistical models: where the model parameters are a superposition of structurally constrained parameters. We allow for any number and types of structures, and any statistical model. We consider the general class of M-estimators that minimize the sum of any loss function, and an instance of what we call a "hybrid" regularization, that is the infimal convolution of weighted regularization functions, one for each structural component. We provide corollaries showcasing our unified framework for varied statistical models such as linear regression, multiple regression and principal component analysis, over varied superposition structures.

matrix, regularization function, statistical model, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.38)

Add feedback

Lasso Screening Rules via Dual Polytope Projection

Neural Information Processing SystemsMar-13-2024, 18:21:45 GMT

Lasso is a widely used regression technique to find sparse representations. When the dimension of the feature space and the number of samples are extremely large, solving the Lasso problem remains challenging. To improve the efficiency of solving large-scale Lasso problems, El Ghaoui and his colleagues have proposed the SAFE rules which are able to quickly identify the inactive predictors, i.e., predictors that have 0 components in the solution vector. Then, the inactive predictors or features can be removed from the optimization problem to reduce its scale. By transforming the standard Lasso to its dual form, it can be shown that the inactive predictors include the set of inactive constraints on the optimal dual solution.

dpp rule, lasso problem, predictor, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Arizona > Maricopa County > Tempe (0.04)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Dropout Training as Adaptive Regularization Stefan Wager, Sida Wang, and Percy Liang Departments of Statistics

Neural Information Processing SystemsMar-13-2024, 15:52:32 GMT

Dropout and other feature noising schemes control overfitting by artificially corrupting the training data. For generalized linear models, dropout performs a form of adaptive regularization.

logistic regression, regression, regularization, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report > New Finding (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)

Add feedback

Integrated Non-Factorized Variational Inference

Neural Information Processing SystemsMar-13-2024, 14:56:28 GMT

We present a non-factorized variational method for full posterior inference in Bayesian hierarchical models, with the goal of capturing the posterior variable dependencies via efficient and possibly parallel computation. Our approach unifies the integrated nested Laplace approximation (INLA) under the variational framework.

approximation, inf vb 1, inference, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey > Hudson County > Secaucus (0.04)
(2 more...)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Sparse Bayesian structure learning with dependent relevance determination prior Anqi Wu1 Mijung Park 2 Jonathan W. Pillow

Neural Information Processing SystemsMar-13-2024, 14:11:15 GMT

In many problem settings, parameter vectors are not merely sparse, but dependent in such a way that non-zero coefficients tend to cluster together. We refer to this form of dependency as "region sparsity". Classical sparse regression methods, such as the lasso and automatic relevance determination (ARD), model parameters as independent a priori, and therefore do not exploit such dependencies. Here we introduce a hierarchical model for smooth, region-sparse weight vectors and tensors in a linear regression setting. Our approach represents a hierarchical extension of the relevance determination framework, where we add a transformed Gaussian process to model the dependencies between the prior variances of regression weights. We combine this with a structured model of the prior variances of Fourier coefficients, which eliminates unnecessary high frequencies. The resulting prior encourages weights to be region-sparse in two different bases simultaneously. We develop efficient approximate inference methods and show substantial improvements over comparable methods (e.g., group lasso and smooth RVM) for both simulated and real datasets from brain imaging.

likelihood, smoothness, sparsity, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry:

Health & Medicine > Health Care Technology (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.66)
Health & Medicine > Diagnostic Medicine > Imaging (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Add feedback

Constant Nullspace Strong Convexity and Fast Convergence of Proximal Methods under High-Dimensional Settings

Neural Information Processing SystemsMar-13-2024, 14:10:55 GMT

State of the art statistical estimators for high-dimensional problems take the form of regularized, and hence non-smooth, convex programs. A key facet of these statistical estimation problems is that these are typically not strongly convex under a high-dimensional sampling regime when the Hessian matrix becomes rankdeficient. Under vanilla convexity however, proximal optimization methods attain only a sublinear rate. In this paper, we investigate a novel variant of strong convexity, which we call Constant Nullspace Strong Convexity (CNSC), where we require that the objective function be strongly convex only over a constant subspace. As we show, the CNSC condition is naturally satisfied by high-dimensional statistical estimators. We then analyze the behavior of proximal methods under this CNSC condition: we show global linear convergence of Proximal Gradient and local quadratic convergence of Proximal Newton Method, when the regularization function comprising the statistical estimator is decomposable. We corroborate our theory via numerical experiments, and show a qualitative difference in the convergence rates of the proximal algorithms when the loss function does satisfy the CNSC condition.

cnsc condition, constant nullspace strong convexity, convergence, (10 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback