AITopics | Regression

Collaborating Authors

Regression

News Overviews Instructional Materials AI-Alerts Classics

Efficient Regularized Least-Squares Algorithms for Conditional Ranking on Relational Data

Pahikkala, Tapio, Airola, Antti, Stock, Michiel, De Baets, Bernard, Waegeman, Willem

arXiv.org Machine LearningJun-8-2013

In domains like bioinformatics, information retrieval and social network analysis, one can find learning tasks where the goal consists of inferring a ranking of objects, conditioned on a particular target object. We present a general kernel framework for learning conditional rankings from various types of relational data, where rankings can be conditioned on unseen data objects. We propose efficient algorithms for conditional ranking by optimizing squared regression and ranking loss functions. We show theoretically, that learning with the ranking loss is likely to generalize better than with the regression loss. Further, we prove that symmetry or reciprocity properties of relations can be efficiently enforced in the learned models. Experiments on synthetic and real-world data illustrate that the proposed methods deliver state-of-the-art performance in terms of predictive power and computational efficiency. Moreover, we also show empirically that incorporating symmetry or reciprocity properties can improve the generalization performance.

data mining, machine learning, relation, (22 more...)

arXiv.org Machine Learning

1209.4825

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
(4 more...)

Add feedback

Expectation-maximization for logistic regression

Scott, James G., Sun, Liang

arXiv.org Machine LearningMay-31-2013

We present a family of expectation-maximization (EM) algorithms for binary and negative-binomial logistic regression, drawing a sharp connection with the variational-Bayes algorithm of Jaakkola and Jordan (2000). Indeed, our results allow a version of this variational-Bayes approach to be re-interpreted as a true EM algorithm. We study several interesting features of the algorithm, and of this previously unrecognized connection with variational Bayes. We also generalize the approach to sparsity-promoting priors, and to an online method whose convergence properties are easily established. This latter method compares favorably with stochastic-gradient descent in situations with marked collinearity.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1306.004

Country: Asia > Middle East > Jordan (0.24)

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

An Ensemble Approach to Instance-Based Regression Using Stretched Neighborhoods

Jalali, Vahid (Indiana University) | Leake, David (Indiana University)

AAAI ConferencesMay-19-2013

Instance-based regression methods generate solutions from prior solutions within a neighborhood of the input query. Their performance depends on both neighborhood selection criteria and on the method for generating new solutions from the values of prior instances. This paper proposes a new approach to addressing both problems, in which solutions are generated by an ensemble of solutions of local linear regression models built for a collection of "stretched" neighborhoods of the query. Each neighborhood is generated by relaxing a different dimension of the problem space. The rationale is to enable major change trends along that dimension to have increased influence on the corresponding model. The approach is evaluated for two candidate relaxation approaches, gradient-based and based on fixed profiles, and compared to baselines of k-NN and using a radius-based spherical neighborhood in n-dimensional space. Results in four test domains show up to 15 percent improvement over baselines, and suggest that the approach could be particularly useful in domains for which the space of prior instances is sparse.

ensemble approach, instance-based regression, stretched neighborhood

AAAI Conferences

The Twenty-Sixth International FLAIRS Conference

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Add feedback

Feature Multi-Selection among Subjective Features

Sabato, Sivan, Kalai, Adam

arXiv.org Machine LearningMay-14-2013

When dealing with subjective, noisy, or otherwise nebulous features, the "wisdom of crowds" suggests that one may benefit from multiple judgments of the same feature on the same object. We give theoretically-motivated `feature multi-selection' algorithms that choose, among a large set of candidate features, not only which features to judge but how many times to judge each one. We demonstrate the effectiveness of this approach for linear regression on a crowdsourced learning task of predicting people's height and weight from photos, using features such as 'gender' and 'estimated weight' as well as culturally fraught ones such as 'attractive'.

artificial intelligence, machine learning, social media, (19 more...)

arXiv.org Machine Learning

1302.4297

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Structure Discovery in Nonparametric Regression through Compositional Kernel Search

Duvenaud, David, Lloyd, James Robert, Grosse, Roger, Tenenbaum, Joshua B., Ghahramani, Zoubin

arXiv.org Machine LearningMay-13-2013

Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art. We define a space of kernel structures which are built compositionally by adding and multiplying a small number of base kernels. We present a method for searching over this space of structures which mirrors the scientific discovery process. The learned structures can often decompose functions into interpretable components and enable long-range extrapolation on time-series datasets. Our structure search method outperforms many widely used kernels and kernel combination methods on a variety of prediction tasks.

artificial intelligence, kernel, machine learning, (14 more...)

arXiv.org Machine Learning

1302.4922

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Add feedback

Affine Invariant Divergences associated with Composite Scores and its Applications

Kanamori, Takafumi, Fujisawa, Hironori

arXiv.org Machine LearningMay-11-2013

In statistical analysis, measuring a score of predictive performance is an important task. In many scientific fields, appropriate scores were tailored to tackle the problems at hand. A proper score is a popular tool to obtain statistically consistent forecasts. Furthermore, a mathematical characterization of the proper score was studied. As a result, it was revealed that the proper score corresponds to a Bregman divergence, which is an extension of the squared distance over the set of probability distributions. In the present paper, we introduce composite scores as an extension of the typical scores in order to obtain a wider class of probabilistic forecasting. Then, we propose a class of composite scores, named Holder scores, that induce equivariant estimators. The equivariant estimators have a favorable property, implying that the estimator is transformed in a consistent way, when the data is transformed. In particular, we deal with the affine transformation of the data. By using the equivariant estimators under the affine transformation, one can obtain estimators that do no essentially depend on the choice of the system of units in the measurement. Conversely, we prove that the Holder score is characterized by the invariance property under the affine transformations. Furthermore, we investigate statistical properties of the estimators using Holder scores for the statistical problems including estimation of regression functions and robust parameter estimation, and illustrate the usefulness of the newly introduced scores for statistical forecasting.

artificial intelligence, composite score, machine learning, (16 more...)

arXiv.org Machine Learning

1305.2473

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

APPLE: Approximate Path for Penalized Likelihood Estimators

Yu, Yi, Feng, Yang

arXiv.org Machine LearningMay-4-2013

In high-dimensional data analysis, penalized likelihood estimators are shown to provide superior results in both variable selection and parameter estimation. A new algorithm, APPLE, is proposed for calculating the Approximate Path for Penalized Likelihood Estimators. Both the convex penalty (such as LASSO) and the nonconvex penalty (such as SCAD and MCP) cases are considered. The APPLE efficiently computes the solution path for the penalized likelihood estimator using a hybrid of the modified predictor-corrector method and the coordinate-descent algorithm. APPLE is compared with several well-known packages via simulation and analysis of two gene expression data sets.

artificial intelligence, ebic 0, machine learning, (17 more...)

arXiv.org Machine Learning

1211.0889

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.68)
Health & Medicine > Therapeutic Area > Hematology (0.68)
Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Model Selection for High-Dimensional Regression under the Generalized Irrepresentability Condition

Javanmard, Adel, Montanari, Andrea

arXiv.org Machine LearningMay-2-2013

In the high-dimensional regression model a response variable is linearly related to $p$ covariates, but the sample size $n$ is smaller than $p$. We assume that only a small subset of covariates is `active' (i.e., the corresponding coefficients are non-zero), and consider the model-selection problem of identifying the active covariates. A popular approach is to estimate the regression coefficients through the Lasso ($\ell_1$-regularized least squares). This is known to correctly identify the active set only if the irrelevant covariates are roughly orthogonal to the relevant ones, as quantified through the so called `irrepresentability' condition. In this paper we study the `Gauss-Lasso' selector, a simple two-stage method that first solves the Lasso, and then performs ordinary least squares restricted to the Lasso active set. We formulate `generalized irrepresentability condition' (GIC), an assumption that is substantially weaker than irrepresentability. We prove that, under GIC, the Gauss-Lasso correctly recovers the active set.

artificial intelligence, irrepresentability condition, machine learning, (15 more...)

arXiv.org Machine Learning

1305.0355

Country: North America > United States > California (0.46)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

Add feedback

Sparsity regret bounds for individual sequences in online linear regression

Gerchinovitz, Sébastien

arXiv.org Machine LearningApr-12-2013

We consider the problem of online linear regression on arbitrary deterministic sequences when the ambient dimension d can be much larger than the number of time rounds T. We introduce the notion of sparsity regret bound, which is a deterministic online counterpart of recent risk bounds derived in the stochastic setting under a sparsity scenario. We prove such regret bounds for an online-learning algorithm called SeqSEW and based on exponential weighting and data-driven truncation. In a second part we apply a parameter-free version of this algorithm to the stochastic setting (regression model with random design). This yields risk bounds of the same flavor as in Dalalyan and Tsybakov (2012a) but which solve two questions left open therein. In particular our risk bounds are adaptive (up to a logarithmic factor) to the unknown variance of the noise if the latter is Gaussian. We also address the regression model with fixed design.

artificial intelligence, inequality, machine learning, (14 more...)

arXiv.org Machine Learning

1101.1057

Country: Europe (0.27)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Predictive Correlation Screening: Application to Two-stage Predictor Design in High Dimension

Firouzi, Hamed, Rajaratnam, Bala, Hero, Alfred

arXiv.org Machine LearningApr-10-2013

We introduce a new approach to variable selection, called Predictive Correlation Screening, for predictor design. Predictive Correlation Screening (PCS) implements false positive control on the selected variables, is well suited to small sample sizes, and is scalable to high dimensions. We establish asymptotic bounds for Familywise Error Rate (FWER), and resultant mean square error of a linear predictor on the selected variables. We apply Predictive Correlation Screening to the following two-stage predictor design problem. An experimenter wants to learn a multivariate predictor of gene expressions based on successive biological samples assayed on mRNA arrays. She assays the whole genome on a few samples and from these assays she selects a small number of variables using Predictive Correlation Screening. To reduce assay cost, she subsequently assays only the selected variables on the remaining samples, to learn the predictor coefficients. We show superiority of Predictive Correlation Screening relative to LASSO and correlation learning (sometimes popularly referred to in the literature as marginal regression or simple thresholding) in terms of performance and computational complexity.

artificial intelligence, machine learning, matrix, (14 more...)

arXiv.org Machine Learning

1303.2378

Genre: Research Report > Experimental Study (0.49)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)

Add feedback