AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Joint variable and rank selection for parsimonious estimation of high-dimensional matrices

Bunea, Florentina, She, Yiyuan, Wegkamp, Marten H.

arXiv.org Machine LearningFeb-13-2013

We propose dimension reduction methods for sparse, high-dimensional multivariate response regression models. Both the number of responses and that of the predictors may exceed the sample size. Sometimes viewed as complementary, predictor selection and rank reduction are the most popular strategies for obtaining lower-dimensional approximations of the parameter matrix in such models. We show in this article that important gains in prediction accuracy can be obtained by considering them jointly. We motivate a new class of sparse multivariate regression models, in which the coefficient matrix has low rank and zero rows or can be well approximated by such a matrix. Next, we introduce estimators that are based on penalized least squares, with novel penalties that impose simultaneous row and rank restrictions on the coefficient matrix. We prove that these estimators indeed adapt to the unknown matrix sparsity and have fast rates of convergence. We support our theoretical results with an extensive simulation study and two data analyses.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1214/12-AOS1039

1110.3556

Country: North America > United States > Florida (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Efficiency for Regularization Parameter Selection in Penalized Likelihood Estimation of Misspecified Models

Flynn, Cheryl J., Hurvich, Clifford M., Simonoff, Jeffrey S.

arXiv.org Machine LearningFeb-8-2013

It has been shown that AIC-type criteria are asymptotically efficient selectors of the tuning parameter in non-concave penalized regression methods under the assumption that the population variance is known or that a consistent estimator is available. We relax this assumption to prove that AIC itself is asymptotically efficient and we study its performance in finite samples. In classical regression, it is known that AIC tends to select overly complex models when the dimension of the maximum candidate model is large relative to the sample size. Simulation studies suggest that AIC suffers from the same shortcomings when used in penalized regression. We therefore propose the use of the classical corrected AIC (AICc) as an alternative and prove that it maintains the desired asymptotic properties. To broaden our results, we further prove the efficiency of AIC for penalized likelihood methods in the context of generalized linear models with no dispersion parameter. Similar results exist in the literature but only for a restricted set of candidate models. By employing results from the classical literature on maximum-likelihood estimation in misspecified models, we are able to establish this result for a general set of candidate models. We use simulations to assess the performance of AIC and AICc, as well as that of other selectors, in finite samples for both SCAD-penalized and Lasso regressions and a real data example is considered.

artificial intelligence, assumption, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1080/01621459.2013.801775

1302.2068

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Distribution-Free Distribution Regression

Poczos, Barnabas, Rinaldo, Alessandro, Singh, Aarti, Wasserman, Larry

arXiv.org Machine LearningFeb-1-2013

'Distribution regression' refers to the situation where a response Y depends on a covariate P where P is a probability distribution. The model is Y f(P) µ where f is an unknown regression function and µ is a random error. Typically, we do not observe P directly, but rather, we observe a sample from P. In this paper we develop theory and methods for distribution-free versions of distribution regression. This means that we do not make distributional assumptions about the error term µ and covariate P. We prove that when the effective dimension is small enough (as measured by the doubling dimension), then the excess prediction risk converges to zero with a polynomial rate.

artificial intelligence, estimator, machine learning, (16 more...)

arXiv.org Machine Learning

1302.0082

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Hierarchical Mixtures-of-Experts for Exponential Family Regression Models with Generalized Linear Mean Functions: A Survey of Approximation and Consistency Results

Jiang, Wenxin, Tanner, Martin A.

arXiv.org Machine LearningJan-30-2013

We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form psi(ga+fx^Tfgb) are mixed. Here psi(...) is the inverse link function. Suppose the true response y follows an exponential family regression model with mean function belonging to a class of smooth functions of the form psi(h(fx)) where h(...)in W_2^infty (a Sobolev class over [0,1]^{s}). It is shown that the HME probability density functions can approximate the true density, at a rate of O(m^{-2/s}) in L_p norm, and at a rate of O(m^{-4/s}) in Kullback-Leibler divergence. These rates can be achieved within the family of HME structures with no more than s-layers, where s is the dimension of the predictor fx. It is also shown that likelihood-based inference based on HME is consistent in recovering the truth, in the sense that as the sample size n and the number of experts m both increase, the mean square error of the predicted mean response goes to zero. Conditions for such results to hold are stated and discussed.

artificial intelligence, machine learning, survey article, (15 more...)

arXiv.org Machine Learning

1301.739

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.68)

Genre:

Research Report (0.64)
Overview (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)

Add feedback

Active Learning of Inverse Models with Intrinsically Motivated Goal Exploration in Robots

Baranes, Adrien, Oudeyer, Pierre-Yves

arXiv.org Artificial IntelligenceJan-21-2013

We introduce the Self-Adaptive Goal Generation - Robust Intelligent Adaptive Curiosity (SAGG-RIAC) architecture as an intrinsi- cally motivated goal exploration mechanism which allows active learning of inverse models in high-dimensional redundant robots. This allows a robot to efficiently and actively learn distributions of parameterized motor skills/policies that solve a corresponding distribution of parameterized tasks/goals. The architecture makes the robot sample actively novel parameterized tasks in the task space, based on a measure of competence progress, each of which triggers low-level goal-directed learning of the motor policy pa- rameters that allow to solve it. For both learning and generalization, the system leverages regression techniques which allow to infer the motor policy parameters corresponding to a given novel parameterized task, and based on the previously learnt correspondences between policy and task parameters. We present experiments with high-dimensional continuous sensorimotor spaces in three different robotic setups: 1) learning the inverse kinematics in a highly-redundant robotic arm, 2) learning omnidirectional locomotion with motor primitives in a quadruped robot, 3) an arm learning to control a fishing rod with a flexible wire. We show that 1) exploration in the task space can be a lot faster than exploration in the actuator space for learning inverse models in redundant robots; 2) selecting goals maximizing competence progress creates developmental trajectories driving the robot to progressively focus on tasks of increasing complexity and is statistically significantly more efficient than selecting tasks randomly, as well as more efficient than different standard active motor babbling methods; 3) this architecture allows the robot to actively discover which parts of its task space it can learn to reach and which part it cannot.

artificial intelligence, machine learning, task space, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.robot.2012.05.008

1301.4862

Country:

Europe (1.00)
North America > United States > New York (0.28)
North America > United States > Wisconsin (0.27)

Genre: Research Report > New Finding (0.45)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.92)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback

Feature Selection and Dualities in Maximum Entropy Discrimination

Jebara, Tony S., Jaakkola, Tommi S.

arXiv.org Machine LearningJan-16-2013

Incorporating feature selection into a classification or regression method often carries a number of advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED as a flexible (Bayesian) regularization approach that subsumes, e.g., support vector classification, regression and exponential family models. For brevity, we restrict ourselves primarily to feature selection in the context of linear classification/regression methods and demonstrate that the proposed approach indeed carries substantial improvements in practice. Moreover, we discuss and develop various extensions of feature selection, including the problem of dealing with example specific but unobserved degrees of freedom -- alignments or invariants.

artificial intelligence, feature selection, machine learning, (18 more...)

arXiv.org Machine Learning

1301.3865

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.89)

Add feedback

Multiple functional regression with both discrete and continuous covariates

Kadri, Hachem, Preux, Philippe, Duflos, Emmanuel, Canu, Stéphane

arXiv.org Machine LearningJan-12-2013

In this paper we present a nonparametric method for extending functional regression methodology to the situation where more than one functional covariate is used to predict a functional response. Borrowing the idea from Kadri et al. (2010a), the method, which support mixed discrete and continuous explanatory variables, is based on estimating a function-valued function in reproducing kernel Hilbert spaces by virtue of positive operator-valued kernels.

artificial intelligence, machine learning, regression, (16 more...)

arXiv.org Machine Learning

1301.2656

Country: Europe > France (0.30)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.75)

Add feedback

Robust High Dimensional Sparse Regression and Matching Pursuit

Chen, Yudong, Caramanis, Constantine, Mannor, Shie

arXiv.org Machine LearningJan-12-2013

We consider high dimensional sparse regression, and develop strategies able to deal with arbitrary -- possibly, severe or coordinated -- errors in the covariance matrix $X$. These may come from corrupted data, persistent experimental errors, or malicious respondents in surveys/recommender systems, etc. Such non-stochastic error-in-variables problems are notoriously difficult to treat, and as we demonstrate, the problem is particularly pronounced in high-dimensional settings where the primary goal is {\em support recovery} of the sparse regressor. We develop algorithms for support recovery in sparse regression, when some number $n_1$ out of $n+n_1$ total covariate/response pairs are {\it arbitrarily (possibly maliciously) corrupted}. We are interested in understanding how many outliers, $n_1$, we can tolerate, while identifying the correct support. To the best of our knowledge, neither standard outlier rejection techniques, nor recently developed robust regression algorithms (that focus only on corrupted response variables), nor recent algorithms for dealing with stochastic noise or erasures, can provide guarantees on support recovery. Perhaps surprisingly, we also show that the natural brute force algorithm that searches over all subsets of $n$ covariate/response pairs, and all subsets of possible support coordinates in order to minimize regression error, is remarkably poor, unable to correctly identify the support with even $n_1 = O(n/k)$ corrupted points, where $k$ is the sparsity. This is true even in the basic setting we consider, where all authentic measurements and noise are independent and sub-Gaussian. In this setting, we provide a simple algorithm -- no more computationally taxing than OMP -- that gives stronger performance guarantees, recovering the support with up to $n_1 = O(n/(\sqrt{k} \log p))$ corrupted points, where $p$ is the dimension of the signal to be recovered.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1301.2725

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Add feedback

Support Vector Regression for Right Censored Data

Goldberg, Yair, Kosorok, Michael R.

arXiv.org Machine LearningJan-12-2013

In many medical studies, estimating the failure time distribution function, or quantities that depend on this distribution, as a function of patient demographic and prognostic variables, is of central importance for risk assessment and health planing. Frequently, such data is subject to right censoring. The goal of this paper is to develop tools for analyzing such data using machine learning techniques. Traditional approaches to right censored failure time analysis include using parametric models, such as the Weibull distribution, and semiparametric models such as proportional hazard models (see Lawless, 2003, for both). Even when less stringent models--such as nonparametric estimation--are used, it is typically assumed that the distribution function is smooth in both time and covariates (Dabrowska, 1987; Gonzalez-Manteiga and Cadarso-Suarez, 1994). These assumptions seem restrictive, especially when considering today's high-dimensional data settings.

artificial intelligence, goldberg and kosorok svm, machine learning, (16 more...)

arXiv.org Machine Learning

1202.513

Country: North America > United States (0.92)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.82)

Add feedback

Semi-Instrumental Variables: A Test for Instrument Admissibility

Chu, Tianjiao, Scheines, Richard, Spirtes, Peter L.

arXiv.org Artificial IntelligenceJan-10-2013

In a causal graphical model, an instrument for a variable X and its effect Y is a random variable that is a cause of X and independent of all the causes of Y except X. (Pearl (1995), Spirtes et al (2000)). Instrumental variables can be used to estimate how the distribution of an effect will respond to a manipulation of its causes, even in the presence of unmeasured common causes (confounders). In typical instrumental variable estimation, instruments are chosen based on domain knowledge. There is currently no statistical test for validating a variable as an instrument. In this paper, we introduce the concept of semi-instrument, which generalizes the concept of instrument. We show that in the framework of additive models, under certain conditions, we can test whether a variable is semi-instrumental. Moreover, adding some distribution assumptions, we can test whether two semi-instruments are instrumental. We give algorithms to estimate the p-value that a random variable is semi-instrumental, and the p-value that two semi-instruments are both instrumental. These algorithms can be used to test the experts' choice of instruments, or to identify instruments automatically.

artificial intelligence, instrument, machine learning, (14 more...)

arXiv.org Artificial Intelligence

1301.2261

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback