Goto

Collaborating Authors

 Regression


Towards Machine Learning in Pharo: Visualizing Linear Regression

#artificialintelligence

This is a small tutorial on how to estimate prices of houses in Pharo using linear regression model from PolyMath. We will then visualize the data points together with the regression line using the new charting capabilities of Roassal3. The main purpose of this blog post is to demonstrate the new charting functionality of Roassal3 that were introduced yesterday. The visualization that we will build is not very pretty, but it will give you a taste of the amazing things that we will be able to do in the near future. Pharo is a pure object-oriented programming language and a powerful environment, focused on simplicity and immediate feedback (think IDE and OS rolled into one).


A regression algorithm for accelerated lattice QCD that exploits sparse inference on the D-Wave quantum annealer

arXiv.org Machine Learning

We propose a regression algorithm that utilizes a learned dictionary optimized for sparse inference on D-Wave quantum annealer. In this regression algorithm, we concatenate the independent and dependent variables as an combined vector, and encode the high-order correlations between them into a dictionary optimized for sparse reconstruction. On a test dataset, the dependent variable is initialized to its average value and then a sparse reconstruction of the combined vector is obtained in which the dependent variable is typically shifted closer to its true value, as in a standard inpainting or denoising task. Here, a quantum annealer, which can presumably exploit a fully entangled initial state to better explore the complex energy landscape, is used to solve the highly non-convex sparse coding optimization problem. The regression algorithm is demonstrated for a lattice quantum chromodynamics simulation data using a D-Wave 2000Q quantum annealer and good prediction performance is achieved. The regression test is performed using six different values for the number of fully connected logical qubits, between 20 and 64, the latter being the maximum that can be embedded on the D-Wave 2000Q. The scaling results indicate that a larger number of qubits gives better prediction accuracy, the best performance being comparable to the best classical regression algorithms reported so far.


Uncertainty Quantification in Ensembles of Honest Regression Trees using Generalized Fiducial Inference

arXiv.org Machine Learning

Due to their accuracies, methods based on ensembles of regression trees are a popular approach for making predictions. Some common examples include Bayesian additive regression trees, boosting and random forests. This paper focuses on honest random forests, which add honesty to the original form of random forests and are proved to have better statistical properties. The main contribution is a new method that quantifies the uncertainties of the estimates and predictions produced by honest random forests. The proposed method is based on the generalized fiducial methodology, and provides a fiducial density function that measures how likely each single honest tree is the true model. With such a density function, estimates and predictions, as well as their confidence/prediction intervals, can be obtained. The promising empirical properties of the proposed method are demonstrated by numerical comparisons with several state-of-the-art methods, and by applications to a few real data sets. Lastly, the proposed method is theoretically backed up by a strong asymptotic guarantee.


Neural networks for option pricing and hedging: a literature review

arXiv.org Machine Learning

This work provides a review of this literature. The motivation for this summary arose from our companion paper Ruf and W ang [2019]. There we continue th e discussions of this note; in particular, of potentially problematic data leakage when training ANNs to historic financial data. This paper is organised in the following way. Section 2 featu res Table 1, a summary of the literature that concerns the use of ANNs for nonparametric pricing (and hedging) of options. Section 3 provides a list of recommended papers from Table 1. Section 4 provides a n overview of related work where ANNs are applied in the context of option pricing and hedging, but not necessarily as nonparametric estimation tools. Section 5 briefly discusses various regularisation techniq ues used in the reviewed literature.


Regression via Arbitrary Quantile Modeling

arXiv.org Artificial Intelligence

In the regression problem, L1 and L2 are the most commonly used loss functions, which produce mean predictions with different biases. However, the predictions are neither robust nor adequate enough since they only capture a few conditional distributions instead of the whole distribution, especially for small datasets. To address this problem, we proposed arbitrary quantile modeling to regulate the prediction, which achieved better performance compared to traditional loss functions. More specifically, a new distribution regression method, Deep Distribution Regression (DDR), is proposed to estimate arbitrary quantiles of the response variable. Our DDR method consists of two models: a Q model, which predicts the corresponding value for arbitrary quantile, and an F model, which predicts the corresponding quantile for arbitrary value. Furthermore, the duality between Q and F models enables us to design a novel loss function for joint training and perform a dual inference mechanism. Our experiments demonstrate that our DDR-joint and DDR-disjoint methods outperform previous methods such as AdaBoost, random forest, LightGBM, and neural networks both in terms of mean and quantile prediction.


Explainable Ordinal Factorization Model: Deciphering the Effects of Attributes by Piece-wise Linear Approximation

arXiv.org Machine Learning

Ordinal regression predicts the objects' labels that exhibit a natural ordering, which is important to many managerial problems such as credit scoring and clinical diagnosis. In these problems, the ability to explain how the attributes affect the prediction is critical to users. However, most, if not all, existing ordinal regression models simplify such explanation in the form of constant coefficients for the main and interaction effects of individual attributes. Such explanation cannot characterize the contributions of attributes at different value scales. To address this challenge, we propose a new explainable ordinal regression model, namely, the Explainable Ordinal Factorization Model (XOFM). XOFM uses the piece-wise linear functions to approximate the actual contributions of individual attributes and their interactions. Moreover, XOFM introduces a novel ordinal transformation process to assign each object the probabilities of belonging to multiple relevant classes, instead of fixing boundaries to differentiate classes. XOFM is based on the Factorization Machines to handle the potential sparsity problem as a result of discretizing the attribute scales. Comprehensive experiments with benchmark datasets and baseline models demonstrate that the proposed XOFM exhibits superior explainability and leads to state-of-the-art prediction accuracy.


A Model of Double Descent for High-dimensional Binary Linear Classification

arXiv.org Machine Learning

We consider a model for logistic regression where only a subset of features of size $p$ is used for training a linear classifier over $n$ training samples. The classifier is obtained by running gradient-descent (GD) on the logistic-loss. For this model, we investigate the dependence of the generalization error on the overparameterization ratio $\kappa=p/n$. First, building on known deterministic results on convergence properties of the GD, we uncover a phase-transition phenomenon for the case of Gaussian regressors: the generalization error of GD is the same as that of the maximum-likelihood (ML) solution when $\kappa<\kappa_\star$, and that of the max-margin (SVM) solution when $\kappa>\kappa_\star$. Next, using the convex Gaussian min-max theorem (CGMT), we sharply characterize the performance of both the ML and SVM solutions. Combining these results, we obtain curves that explicitly characterize the generalization error of GD for varying values of $\kappa$. The numerical results validate the theoretical predictions and unveil double-descent phenomena that complement similar recent observations in linear regression settings.


Fairness-Aware Neural R\'eyni Minimization for Continuous Features

arXiv.org Artificial Intelligence

The past few years have seen a dramatic rise of academic and societal interest in fair machine learning. While plenty of fair algorithms have been proposed recently to tackle this challenge for discrete variables, only a few ideas exist for continuous ones. The objective in this paper is to ensure some independence level between the outputs of regression models and any given continuous sensitive variables. For this purpose, we use the Hirschfeld-Gebelein-R enyi (HGR) maximal correlation coefficient as a fairness metric. We propose two approaches to minimize the HGR coefficient. First, by reducing an upper bound of the HGR with a neural network estimation of the ฯ‡ 2 divergence. The idea is to predict the output Y while minimizing the ability of an adversarial neural network to find the estimated transformations which are required to predict the HGR coefficient. We empirically assess and compare our approaches and demonstrate significant improvements on previously presented work in the field. 1 Introduction The use of machine learning algorithms in our day-to-day life has become ubiquitous. However, when trained on biased data, those algorithms are prone to learn, perpetuate or even reinforce these biases [6]. Because many applications have far-reaching consequences (credit rating, insurance pricing, recidivism score, etc.), there is an increasing concern in society that the use of machine learning models could reproduce discrimination based on sensitive attributes such as gender, race, age, weight, or other.


Achieving Differential Privacy in Vertically Partitioned Multiparty Learning

arXiv.org Machine Learning

Preserving differential privacy has been well studied under centralized setting. However, it's very challenging to preserve differential privacy under multiparty setting, especially for the vertically partitioned case. In this work, we propose a new framework for differential privacy preserving multiparty learning in the vertically partitioned setting. Our core idea is based on the functional mechanism that achieves differential privacy of the released model by adding noise to the objective function. We show the server can simply dissect the objective function into single-party and cross-party sub-functions, and allocate computation and perturbation of their polynomial coefficients to local parties. Our method needs only one round of noise addition and secure aggregation. The released model in our framework achieves the same utility as applying the functional mechanism in the centralized setting. Evaluation on real-world and synthetic datasets for linear and logistic regressions shows the effectiveness of our proposed method.