 Regression


"More Than Words": Linking Music Preferences and Moral Values Through Lyrics

arXiv.org Artificial Intelligence

This study explores the association between music preferences and moral values by applying text analysis techniques to lyrics. Harvesting data from a Facebook-hosted application, we align psychometric scores of 1,386 users to lyrics from the top 5 songs of their preferred music artists as emerged from Facebook Page Likes. We extract a set of lyrical features related to each song's overarching narrative, moral valence, sentiment, and emotion. A machine learning framework was designed to exploit regression approaches and evaluate the predictive power of lyrical features for inferring moral values. Results suggest that lyrics from top songs of artists people like inform their morality. Virtues of hierarchy and tradition achieve higher prediction scores ($.20 \leq r \leq .30$) than values of empathy and equality ($.08 \leq r \leq .11$), while basic demographic variables only account for a small part in the models' explainability. This shows the importance of music listening behaviours, as assessed via lyrical preferences, alone in capturing moral values. We discuss the technological and musicological implications and possible future improvements.


Macroeconomic Predictions using Payments Data and Machine Learning

arXiv.org Artificial Intelligence

Predicting the economy's short-term dynamics -- a vital input to economic agents' decision-making process -- often uses lagged indicators in linear models. This is typically sufficient during normal times but could prove inadequate during crisis periods. This paper aims to demonstrate that non-traditional and timely data such as retail and wholesale payments, with the aid of nonlinear machine learning approaches, can provide policymakers with sophisticated models to accurately estimate key macroeconomic indicators in near real-time. Moreover, we provide a set of econometric tools to mitigate overfitting and interpretability challenges in machine learning models to improve their effectiveness for policy use. Our models with payments data, nonlinear methods, and tailored cross-validation approaches help improve macroeconomic nowcasting accuracy up to 40% -- with higher gains during the COVID-19 period. We observe that the contribution of payments data for economic predictions is small and linear during low and normal growth periods. However, the payments data contribution is large, asymmetrical, and nonlinear during strong negative or positive growth periods.


Linear Model

#artificialintelligence

Important Note: Original article: http://ecdicus.com/linear-model/ The linear model is the most widely used model in machine learning. It refers to a model that uses a linear combination of sample features to make predictions: $f(\textbf{x};\textbf{w}) = w_1 x_1 + \dots + w_D x_D + b = \textbf{w}^T \textbf{x} + b$, where $\textbf{w} = [w_1, \dots, w_D]^T$ is the D-dimensional weight vector and b is the bias. The linear regression introduced in the previous chapter is a typical linear model, in which $f(\textbf{x};\textbf{w})$ is used directly to predict the output target: $y = f(\textbf{x};\textbf{w})$. In a classification problem, the output target y is a discrete label while $f(\textbf{x};\textbf{w})$ takes real values, so $f(\textbf{x};\textbf{w})$ cannot be used directly for prediction. A non-linear decision function $g(\cdot)$ is introduced to predict the output target: $y = g(f(\textbf{x};\textbf{w}))$, where $f(\textbf{x};\textbf{w})$ is also called the discriminant function.
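The relationship between the discriminant function and the decision function can be illustrated with a minimal Python sketch. The excerpt does not say which decision function g the article uses, so the sign function below is an assumption chosen for a two-class example. The names `discriminant`, `decide`, and `predict` are hypothetical.

```python
from typing import Sequence


def discriminant(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Linear discriminant f(x; w) = w^T x + b."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same dimension D")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def decide(score: float) -> int:
    """Assumed decision function g: the sign of the score, mapped to {+1, -1}."""
    return 1 if score > 0 else -1


def predict(x: Sequence[float], w: Sequence[float], b: float) -> int:
    """Classification output y = g(f(x; w))."""
    return decide(discriminant(x, w, b))


# Illustrative (made-up) parameters for a 2-dimensional input.
w = [2.0, -1.0]
b = -0.5
print(discriminant([1.0, 0.5], w, b))  # real-valued score, usable as-is for regression
print(predict([1.0, 0.5], w, b))       # discrete label for classification
```

For regression the score from `discriminant` is the prediction itself. For classification only the side of the hyperplane $\textbf{w}^T \textbf{x} + b = 0$ on which the sample falls matters.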


Dealing with Outliers Using Three Robust Linear Regression Models

#artificialintelligence

Roughly 10% of the data was identified as outliers and all the observations introduced were correctly classified as outliers. Then, we can quickly visualize the inliers compared to outliers to see the remaining 26 observations flagged as outliers. Figure 5 shows that the observations located farthest from the hypothetical best-fit line of the original data are considered outliers. The last of the robust regression algorithms available in scikit-learn is the Theil-Sen regression. It is a non-parametric regression method, which means that it makes no assumption about the underlying data distribution. In short, it involves fitting multiple regression models on subsets of the training data and then aggregating the coefficients at the last step.
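The subset-then-aggregate idea behind Theil-Sen can be sketched for a single feature using only the Python standard library. This is a simplified illustration, not scikit-learn's implementation: it assumes the subsets are all pairs of points and that the slopes are aggregated with the median. The function name `theil_sen_fit` and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import median
from typing import Sequence, Tuple


def theil_sen_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Fit y = slope * x + intercept with a one-feature Theil-Sen estimator.

    Each pair of points defines a tiny regression model (a line through
    the two points); the final slope is the median of those slopes.
    """
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i, j in combinations(range(len(xs)), 2)
        if xs[j] != xs[i]  # skip vertical pairs, whose slope is undefined
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # Intercept: median of the residuals left after removing the slope.
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Toy data on the line y = 2x + 1, with the last point replaced by an outlier.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 100]
print(theil_sen_fit(xs, ys))  # (2.0, 1.0)
```

Six of the ten pairwise slopes are exactly 2, so the median ignores the four slopes contaminated by the outlier. An ordinary least-squares fit on the same data would be pulled strongly toward the last point.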


Bayesian Variable Selection in a Million Dimensions

arXiv.org Machine Learning

Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been hampered by computational challenges, especially in difficult regimes with a large number of covariates P or non-conjugate likelihoods. To scale to the large P regime we introduce an efficient MCMC scheme whose cost per iteration is sublinear in P. In addition we show how this scheme can be extended to generalized linear models for count data, which are prevalent in biology, ecology, economics, and beyond. In particular we design efficient algorithms for variable selection in binomial and negative binomial regression, which includes logistic regression as a special case. In experiments we demonstrate the effectiveness of our methods, including on cancer and maize genomic data.


Embedding Functional Data: Multidimensional Scaling and Manifold Learning

arXiv.org Artificial Intelligence

We adapt concepts, methodology, and theory originally developed in the areas of multidimensional scaling and dimensionality reduction for multivariate data to the functional setting. We focus on classical scaling and Isomap -- prototypical methods that have played important roles in these areas -- and showcase their use in the context of functional data analysis. In the process, we highlight the crucial role that the ambient metric plays.


Machine learning in the prediction of cardiac epicardial and mediastinal fat volumes

arXiv.org Artificial Intelligence

We propose a methodology to predict the cardiac epicardial and mediastinal fat volumes in computed tomography images using regression algorithms. The obtained results indicate that it is feasible to predict these fats with a high degree of correlation, thus alleviating the requirement for manual or automatic segmentation of both fat volumes. Instead, segmenting just one of them suffices, while the volume of the other may be predicted fairly precisely. The correlation coefficient obtained by the Rotation Forest algorithm using MLP Regressor for predicting the mediastinal fat based on the epicardial fat was 0.9876, with a relative absolute error of 14.4% and a root relative squared error of 15.7%. The best correlation coefficient obtained in the prediction of the epicardial fat based on the mediastinal was 0.9683 with a relative absolute error of 19.6% and a relative squared error of 24.9%. Moreover, we analysed the feasibility of using linear regressors, which provide an intuitive interpretation of the underlying approximations. In this case, the obtained correlation coefficient was 0.9534 for predicting the mediastinal fat based on the epicardial, with a relative absolute error of 31.6% and a root relative squared error of 30.1%. On the prediction of the epicardial fat based on the mediastinal fat, the correlation coefficient was 0.8531, with a relative absolute error of 50.43% and a root relative squared error of 52.06%. In summary, it is possible to speed up general medical analyses and some segmentation and quantification methods that are currently employed in the state-of-the-art by using this prediction approach, which consequently reduces costs and therefore enables preventive treatments that may lead to a reduction of health problems.
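The three evaluation measures quoted above can be computed as in the following sketch. The abstract does not define them, so the code assumes the common definitions in which relative absolute error and root relative squared error are normalized by a baseline that always predicts the mean of the actual values. The function names and the sample numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def pearson_r(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation coefficient between actual and predicted values."""
    ma, mp = _mean(actual), _mean(predicted)
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


def relative_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of absolute errors divided by that of the predict-the-mean baseline."""
    ma = _mean(actual)
    return (sum(abs(p - a) for a, p in zip(actual, predicted))
            / sum(abs(a - ma) for a in actual))


def root_relative_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Square root of squared error divided by that of the predict-the-mean baseline."""
    ma = _mean(actual)
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted))
                / sum((a - ma) ** 2 for a in actual))


# Made-up volumes, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(pearson_r(actual, predicted))
print(relative_absolute_error(actual, predicted))       # multiply by 100 for a percentage
print(root_relative_squared_error(actual, predicted))   # multiply by 100 for a percentage
```

Under these definitions a value of 1.0 (100%) means the model does no better than always predicting the mean, which is why a high correlation coefficient can coexist with a relative error well above zero.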


Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

arXiv.org Artificial Intelligence

High-dimensional models often have a large memory footprint and must be quantized after training before being deployed on resource-constrained edge devices for inference tasks. In this work, we develop an information-theoretic framework for the problem of quantizing a linear regressor learned from training data $(\mathbf{X}, \mathbf{y})$, for some underlying statistical relationship $\mathbf{y} = \mathbf{X}\boldsymbol{\theta} + \mathbf{v}$. The learned model, which is an estimate of the latent parameter $\boldsymbol{\theta} \in \mathbb{R}^d$, is constrained to be representable using only $Bd$ bits, where $B \in (0, \infty)$ is a pre-specified budget and $d$ is the dimension. We derive an information-theoretic lower bound for the minimax risk under this setting and propose a matching upper bound using randomized embedding-based algorithms which is tight up to constant factors. The lower and upper bounds together characterize the minimum threshold bit-budget required to achieve a performance risk comparable to the unquantized setting. We also propose randomized Hadamard embeddings that are computationally efficient and are optimal up to a mild logarithmic factor of the lower bound. Our model quantization strategy can be generalized and we show its efficacy by extending the method and upper-bounds to two-layer ReLU neural networks for non-linear regression. Numerical simulations show the improved performance of our proposed scheme as well as its closeness to the lower bound.


Online Active Regression

arXiv.org Artificial Intelligence

Active regression considers a linear regression problem where the learner receives a large number of data points but can only observe a small number of labels. Since online algorithms can deal with incremental training data and take advantage of low computational cost, we consider an online extension of the active regression problem: the learner receives data points one by one and immediately decides whether it should collect the corresponding labels. The goal is to efficiently maintain the regression of received data points with a small budget of label queries. We propose novel algorithms for this problem under $\ell_p$ loss where $p\in[1,2]$. To achieve a $(1+\epsilon)$-approximate solution, our proposed algorithms only require $\tilde{\mathcal{O}}(\epsilon^{-1} d \log(n\kappa))$ queries of labels, where $n$ is the number of data points and $\kappa$ is a quantity, called the condition number, of the data points. The numerical results verify our theoretical results and show that our methods have comparable performance with offline active regression algorithms.


Discriminative Learning of Similarity and Group Equivariant Representations

arXiv.org Artificial Intelligence

One of the most fundamental problems in machine learning is to compare examples: Given a pair of objects we want to return a value which indicates degree of (dis)similarity. Similarity is often task specific, and pre-defined distances can perform poorly, leading to work in metric learning. However, being able to learn a similarity-sensitive distance function also presupposes access to a rich, discriminative representation for the objects at hand. In this dissertation we present contributions towards both ends. In the first part of the thesis, assuming good representations for the data, we present a formulation for metric learning that makes a more direct attempt to optimize for the k-NN accuracy as compared to prior work. We also present extensions of this formulation to metric learning for k-NN regression, asymmetric similarity learning and discriminative learning of Hamming distance. In the second part, we consider a situation where we are on a limited computational budget, i.e. optimizing over a space of possible metrics would be infeasible, but access to a label-aware distance metric is still desirable. We present a simple and computationally inexpensive approach for estimating a well-motivated metric that relies only on gradient estimates, discussing theoretical and experimental results. In the final part, we address representational issues, considering group equivariant convolutional neural networks (GCNNs). Equivariance to symmetry transformations is explicitly encoded in GCNNs; a classical CNN being the simplest example. In particular, we present an SO(3)-equivariant neural network architecture for spherical data that operates entirely in Fourier space, while also providing a formalism for the design of fully Fourier neural networks that are equivariant to the action of any continuous compact group.