AITopics | Regression

Regression

Learning Output Embeddings in Structured Prediction

Brogat-Motte, Luc, Rudi, Alessandro, Brouard, Céline, Rousu, Juho, d'Alché-Buc, Florence

arXiv.org Machine Learning, Nov-2-2020

A powerful and flexible approach to structured prediction consists of embedding the structured objects to be predicted into a feature space of possibly infinite dimension by means of output kernels, and then solving a regression problem in this output space. A prediction in the original space is computed by solving a pre-image problem. In such an approach, the embedding, linked to the target loss, is defined prior to the learning phase. In this work, we propose to jointly learn a finite approximation of the output embedding and the regression function into the new feature space. For that purpose, we leverage a priori information on the outputs as well as unexploited unsupervised output data, both of which are often available in structured prediction problems. We prove that the resulting structured predictor is a consistent estimator, and derive an excess risk bound. Moreover, the novel structured prediction tool enjoys a significantly smaller computational complexity than former output kernel methods. Tested empirically on various structured prediction problems, the approach proves versatile and able to handle large datasets.

artificial intelligence, inductive learning, machine learning, (15 more...)

2007.14703

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

On the Optimal Weighted $\ell_2$ Regularization in Overparameterized Linear Regression

Wu, Denny, Xu, Ji

arXiv.org Machine Learning, Nov-2-2020

We consider the linear model $\mathbf{y} = \mathbf{X} \mathbf{\beta}_\star + \mathbf{\epsilon}$ with $\mathbf{X}\in \mathbb{R}^{n\times p}$ in the overparameterized regime $p>n$. We estimate $\mathbf{\beta}_\star$ via generalized (weighted) ridge regression: $\hat{\mathbf{\beta}}_\lambda = \left(\mathbf{X}^T\mathbf{X} + \lambda \mathbf{\Sigma}_w\right)^\dagger \mathbf{X}^T\mathbf{y}$, where $\mathbf{\Sigma}_w$ is the weighting matrix. Under a random design setting with general data covariance $\mathbf{\Sigma}_x$ and anisotropic prior on the true coefficients $\mathbb{E}\mathbf{\beta}_\star\mathbf{\beta}_\star^T = \mathbf{\Sigma}_\beta$, we provide an exact characterization of the prediction risk $\mathbb{E}(y-\mathbf{x}^T\hat{\mathbf{\beta}}_\lambda)^2$ in the proportional asymptotic limit $p/n\rightarrow \gamma \in (1,\infty)$. Our general setup leads to a number of interesting findings. We outline precise conditions that decide the sign of the optimal setting $\lambda_{\rm opt}$ for the ridge parameter $\lambda$ and confirm the implicit $\ell_2$ regularization effect of overparameterization, which theoretically justifies the surprising empirical observation that $\lambda_{\rm opt}$ can be negative in the overparameterized regime. We also characterize the double descent phenomenon for principal component regression (PCR) when both $\mathbf{X}$ and $\mathbf{\beta}_\star$ are anisotropic. Finally, we determine the optimal weighting matrix $\mathbf{\Sigma}_w$ for both the ridgeless ($\lambda\to 0$) and optimally regularized ($\lambda = \lambda_{\rm opt}$) case, and demonstrate the advantage of the weighted objective over standard ridge regression and PCR.
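
The estimator in the abstract above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the synthetic data, the noise level, the explicit `rcond` cutoff, and the function name `weighted_ridge` are all hypothetical and not taken from the paper, and the identity weighting matrix is used purely as a baseline (it recovers standard ridge).

```python
import numpy as np

def weighted_ridge(X, y, lam, Sigma_w):
    """Generalized ridge estimate: pinv(X^T X + lam * Sigma_w) @ X^T @ y."""
    gram = X.T @ X + lam * Sigma_w
    # Pseudo-inverse, as in the abstract's formula; it also covers lam = 0,
    # where X^T X is singular because p > n. The rcond cutoff is an assumption.
    return np.linalg.pinv(gram, rcond=1e-10) @ X.T @ y

# Synthetic overparameterized problem (p > n); all values are assumptions.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.standard_normal((n, p))
beta_star = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_star + 0.1 * rng.standard_normal(n)

# Identity weighting recovers standard ridge regression.
beta_ridge = weighted_ridge(X, y, lam=1.0, Sigma_w=np.eye(p))
# lam = 0 gives the "ridgeless" minimum-norm solution, which fits y exactly here.
beta_ridgeless = weighted_ridge(X, y, lam=0.0, Sigma_w=np.eye(p))
```

Passing a different positive definite `Sigma_w` changes which coefficient directions are penalized most. The sketch makes no attempt to compute the optimal weighting or the optimal $\lambda$ studied in the paper.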

artificial intelligence, machine learning, regression, (16 more...)

2006.058

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.64)

Evaluation Metrics for Regression Analysis

#artificialintelligence, Nov-1-2020, 10:20:35 GMT

These terms will come up, and it's good to get familiar with them if you aren't already. Goodness of fit is typically a term used to describe how well a dataset aligns with a certain statistical distribution. Here, we're going to think of it as a way of describing how well our model is fitted to our data. If we think about our regression model in terms of the imaginary "best-fit" line it produces, then it makes sense that we would want to know how well this line matches our data. This goodness of fit can be quantified in a variety of ways, but the R² and adjusted R² scores are two of the most common measures of how well our model captures the variance in our target data. R², also called the coefficient of determination, is a statistical measure representing the proportion of the variance of a dependent variable that is captured by your model's predictions.
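
As a rough illustration of the two scores mentioned above, here is a minimal pure-Python sketch using the standard textbook definitions. The function names and the toy numbers are assumptions made for the example, not taken from the article.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (coefficient of determination)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2 penalizes R^2 for the number of predictors used."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

# Toy data (hypothetical): four targets and a model's predictions for them.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.5, 8.5]

r2 = r2_score(y_true, y_pred)
adj = adjusted_r2(r2, n_samples=len(y_true), n_features=1)
```

For these toy numbers the residual sum of squares is 1 and the total sum of squares is 20, so R² is about 0.95 and the adjusted score, with one predictor, is about 0.925.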

evaluation metric, prediction, regression analysis, (10 more...)

Genre:

Research Report > New Finding (0.41)
Research Report > Experimental Study (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.74)

DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks

Xu, Shiyun

arXiv.org Machine Learning, Nov-1-2020

Recent years have witnessed strong empirical performance of over-parameterized neural networks on various tasks and many advances in the theory, e.g., universal approximation and provable convergence to a global minimum. In this paper, we incorporate over-parameterized neural networks into semi-parametric models to bridge the gap between inference and prediction, especially in the high-dimensional linear problem. By doing so, we can exploit a wide class of networks to approximate the nuisance functions and to estimate the parameters of interest consistently. We may therefore offer the best of both worlds: the universal approximation ability of neural networks and the interpretability of the classical linear model, leading to valid inference and accurate prediction. We show the theoretical foundations that make this possible and demonstrate them with numerical experiments. Furthermore, we propose a framework, DebiNet, in which we plug arbitrary feature selection methods into our semi-parametric neural network, and illustrate that our framework debiases the regularized estimators and performs well in terms of post-selection inference and generalization error.

artificial intelligence, machine learning, neural network, (14 more...)

2011.00417

Country:

Asia > Taiwan (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Learning Deep Features in Instrumental Variable Regression

Xu, Liyuan, Chen, Yutian, Srinivasan, Siddarth, de Freitas, Nando, Doucet, Arnaud, Gretton, Arthur

arXiv.org Machine Learning, Nov-1-2020

Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data by utilizing an instrumental variable, which affects the outcome only through the treatment. In classical IV regression, learning proceeds in two stages: stage 1 performs linear regression from the instrument to the treatment; and stage 2 performs linear regression from the treatment to the outcome, conditioned on the instrument. We propose a novel method, deep feature instrumental variable regression (DFIV), to address the case where relations between instruments, treatments, and outcomes may be nonlinear. In this case, deep neural nets are trained to define informative nonlinear features on the instruments and treatments. We propose an alternating training regime for these features to ensure good end-to-end performance when composing stages 1 and 2, thus obtaining highly flexible feature maps in a computationally efficient manner. DFIV outperforms recent state-of-the-art methods on challenging IV benchmarks, including settings involving high dimensional image data. DFIV also exhibits competitive performance in off-policy policy evaluation for reinforcement learning, which can be understood as an IV regression task.
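
The classical two-stage procedure described in the abstract above can be sketched with ordinary least squares in NumPy. This is a sketch of textbook two-stage least squares on synthetic data, not of DFIV itself. The data-generating process, the true effect of 2.0, and the function name `two_stage_least_squares` are assumptions made for illustration.

```python
import numpy as np

def two_stage_least_squares(Z, X, y):
    """Textbook 2SLS: Z holds instruments, X treatments, y the outcome."""
    # Stage 1: linear regression from the instrument to the treatment.
    stage1_coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ stage1_coef  # treatment as predicted from the instrument
    # Stage 2: linear regression from the predicted treatment to the outcome.
    stage2_coef, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return stage2_coef

# Synthetic confounded data (all values are assumptions for this sketch).
rng = np.random.default_rng(0)
n = 5000
z = rng.standard_normal(n)                      # instrument
u = rng.standard_normal(n)                      # unobserved confounder
x = z + u + 0.1 * rng.standard_normal(n)        # treatment, affected by z and u
y = 2.0 * x + 3.0 * u + 0.1 * rng.standard_normal(n)  # true causal effect: 2.0

# Add an intercept column to both design matrices.
Z = np.column_stack([np.ones(n), z])
X = np.column_stack([np.ones(n), x])

beta_iv = two_stage_least_squares(Z, X, y)          # slope should be near 2.0
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)    # slope is biased upward by u
```

Because the confounder `u` drives both the treatment and the outcome, the plain least-squares slope overstates the effect, while the two-stage slope stays close to the value used to generate the data.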

machine learning, regression, reinforcement learning, (16 more...)

2010.07154

Country: Asia > Vietnam (0.04)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

Your Data Science Toolbox -- What is Inside?

#artificialintelligence, Oct-31-2020, 16:55:10 GMT

Data science is a broad, multidisciplinary field that includes subdivisions such as data visualization, machine learning, and artificial intelligence. Because the field is so broad and changes constantly with technological innovation and the development of new algorithms, a successful data scientist has to maintain a large, up-to-date toolbox at all times. Keep in mind that, as a data scientist, you can only perform the tasks you have the right tools for. This article discusses several tools that one can include in a data science toolbox. Knowledge-based tools can be grouped into three main categories according to the level of the data science tasks involved: level 1 (basic), level 2 (intermediate), and level 3 (advanced). Basic tools are those that enable one to perform level 1 tasks.

artificial intelligence, data scientist, machine learning, (15 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.05)

Industry: Information Technology (0.51)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)

Error-Correcting Output Codes (ECOC) for Machine Learning

#artificialintelligence, Oct-31-2020, 06:45:38 GMT

Some machine learning algorithms, such as logistic regression and support vector machines, are designed for two-class (binary) classification problems. As such, these algorithms must either be modified for multi-class (more than two) classification problems or not used at all. The Error-Correcting Output Codes method is a technique that reframes a multi-class classification problem as multiple binary classification problems, so that native binary classification models can be used directly. Unlike the one-vs-rest and one-vs-one methods, which offer a similar solution by dividing a multi-class classification problem into a fixed number of binary classification problems, the error-correcting output codes technique allows each class to be encoded as an arbitrary number of binary classification problems. When an overdetermined representation is used, the extra models act as "error-correction" predictions that can result in better predictive performance.
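
The encoding and decoding steps described above can be illustrated without any learning library. The sketch below assumes a hypothetical four-class problem with class names "A" to "D" and a 7-bit codebook chosen for the example. Training the seven binary classifiers is left out, and their outputs are simulated by a hand-written bit vector.

```python
# Hypothetical 7-bit codebook for 4 classes. Each column defines one binary
# problem; any two rows differ in 4 positions, so one wrong bit is correctable.
CODEBOOK = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}

def hamming(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def binary_targets(labels, bit):
    """Relabel multi-class labels as 0/1 targets for the binary problem `bit`."""
    return [CODEBOOK[label][bit] for label in labels]

def decode(predicted_bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODEBOOK, key=lambda label: hamming(CODEBOOK[label], predicted_bits))

# Simulated outputs of the 7 binary models for a class-"C" sample,
# with the fifth model (index 4) making a mistake.
noisy_bits = (0, 0, 1, 1, 1, 0, 1)
predicted_class = decode(noisy_bits)
```

Here the noisy vector is one bit away from the codeword for "C" and at least three bits away from every other codeword, so decoding still recovers "C" despite the single wrong binary prediction.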

artificial intelligence, classification problem, machine learning, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)

On Optimality of Meta-Learning in Fixed-Design Regression with Weighted Biased Regularization

Konobeev, Mikhail, Kuzborskij, Ilja, Szepesvári, Csaba

arXiv.org Machine Learning, Oct-31-2020

We consider fixed-design linear regression in the meta-learning model of Baxter (2000) and establish a problem-dependent finite-sample lower bound on the transfer risk (the risk on a newly observed task) that is valid for all estimators. Moreover, we prove that a weighted form of biased regularization, a popular technique in transfer and meta-learning, is optimal, i.e., it enjoys a problem-dependent upper bound on the risk matching our lower bound up to a constant. Thus, our bounds characterize meta-learning linear regression problems and reveal a fine-grained dependency on the task structure. Our characterization suggests that in the non-asymptotic regime, for a sufficiently large number of tasks, meta-learning can be considerably superior to single-task learning. Finally, we propose a practical adaptation of the optimal estimator through an Expectation-Maximization procedure and show its effectiveness in a series of experiments.
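
To make "biased regularization" concrete, the sketch below solves the plain, unweighted version: least squares with a penalty that pulls the solution towards a bias vector `w0` rather than towards zero. The paper studies a weighted form of this penalty, which is not reproduced here. The function name, the data, and the choice of `w0` are assumptions for illustration only.

```python
import numpy as np

def biased_ridge(X, y, w0, lam):
    """Minimize ||X w - y||^2 + lam * ||w - w0||^2 in closed form."""
    p = X.shape[1]
    # Setting the gradient to zero gives (X^T X + lam I) w = X^T y + lam w0.
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y + lam * w0)

# A new task with very few samples (all values are assumptions).
rng = np.random.default_rng(0)
n, p = 5, 3
w_true = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((n, p))
y = X @ w_true + 0.1 * rng.standard_normal(n)

# Hypothetical bias vector, e.g. an average of estimates from earlier tasks.
w0 = w_true + 0.05 * rng.standard_normal(p)
lam = 1.0
w_hat = biased_ridge(X, y, w0, lam)
```

With `w0` set to zero this reduces to standard ridge regression, and as `lam` grows the solution approaches `w0`, which is how information from previous tasks enters the estimate for a new one.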

artificial intelligence, estimator, machine learning, (15 more...)

2011.00344

Country:

North America > Canada > Alberta (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry: Education (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Strongly universally consistent nonparametric regression and classification with privatised data

Berrett, Thomas, Györfi, László, Walk, Harro

arXiv.org Machine Learning, Oct-31-2020

In this paper we revisit the classical problem of nonparametric regression, but impose local differential privacy constraints. Under such constraints, the raw data $(X_1,Y_1),\ldots,(X_n,Y_n)$, taking values in $\mathbb{R}^d \times \mathbb{R}$, cannot be directly observed, and all estimators are functions of the randomised output from a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of a feature vector $X_i$ and to the value of its response variable $Y_i$. Based on this randomised data, we design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator. The main result is that the estimator is strongly universally consistent. Our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.

artificial intelligence, machine learning, privacy mechanism, (14 more...)

2011.00216

Country:

Europe > United Kingdom (0.28)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Europe > Hungary > Budapest > Budapest (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Estimating NBA players salary share according to their performance on court: A machine learning approach

Papadaki, Ioanna, Tsagris, Michail

arXiv.org Machine Learning, Oct-31-2020

The relationship between professional athletes' field performance and their salaries is a topic that has attracted the interest of numerous researchers (Garris and Wilkes, 2017; Olbrecht, 2009; Vincent and Eastman, 2009; Wiseman and Chatterjee, 2010; Yilmaz and Chatterjee, 2003; Zimmer and Zimmer, 2001). The general question of interest is whether players deserve their salaries based on their performance statistics. We emphasize that this relationship is not linear, and hence linear models are bound to fail in capturing the underlying true association. An additional concern, separate from non-linearity, is model predictability, for which internal evaluation has limitations and leads to over-optimistic performance estimates. These and other matters, discussed later, require delicate treatment and, if not properly addressed, will yield erroneous results.

artificial intelligence, machine learning, statistics, (19 more...)

2007.14694

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report > New Finding (0.93)

Industry: Leisure & Entertainment > Sports > Basketball (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
