Regression
How are the people in the photos judged? Analysis of brain activity when assessing levels of trust and attractiveness
Bartosik, Bernadetta, Wojcik, Grzegorz M., Kawiak, Andrzej, Brzezicka, Aneta
Trust is the foundation of every area of life. Without it, it is difficult to build lasting relationships. Unfortunately, in recent years, trust has been severely damaged by the spread of fake news and disinformation, which has become a serious social problem. In addition to trust, the factor influencing interpersonal relationships is perceived attractiveness, which is currently created to a large extent by digital media. Understanding the principles of judging others can be helpful in fighting prejudice and rebuilding trust in society. One way to learn about people's choices is to record their brain activity as they make choices. The article presents an experiment in which the faces of different people were presented, and the participants' task was to assess how much they can trust a given person and how attractive they are. During the study, the EEG signal was recorded, which was used to build models of logistic regression classifiers. In addition, the most active areas of the brain that participate in the assessment of trust and attractiveness of the face were indicated.
Performance Evaluation and Comparison of a New Regression Algorithm
Gooljar, Sabina, Manohar, Kris, Hosein, Patrick
In recent years, Machine Learning algorithms, in particular supervised learning techniques, have been shown to be very effective in solving regression problems. We compare the performance of a newly proposed regression algorithm against four conventional machine learning algorithms namely, Decision Trees, Random Forest, k-Nearest Neighbours and XG Boost. The proposed algorithm was presented in detail in a previous paper but detailed comparisons were not included. We do an in-depth comparison, using the Mean Absolute Error (MAE) as the performance metric, on a diverse set of datasets to illustrate the great potential and robustness of the proposed approach. The reader is free to replicate our results since we have provided the source code in a GitHub repository while the datasets are publicly available.
On Consistency and Asymptotic Normality of Least Absolute Deviation Estimators for 2-dimensional Sinusoidal Model
Roy, Saptarshi, Mitra, Amit, Archak, N K
Estimation of the parameters of a 2-dimensional sinusoidal model is a fundamental problem in digital signal processing and time series analysis. In this paper, we propose a robust least absolute deviation (LAD) estimators for parameter estimation. The proposed methodology provides a robust alternative to non-robust estimation techniques like the least squares estimators, in situations where outliers are present in the data or in the presence of heavy tailed noise. We study important asymptotic properties of the LAD estimators and establish the strong consistency and asymptotic normality of the LAD estimators of the signal parameters of a 2-dimensional sinusoidal model. We further illustrate the advantage of using LAD estimators over least squares estimators through extensive simulation studies. Data analysis of a 2-dimensional texture data indicates practical applicability of the proposed LAD approach.
Predicting Wireless Channel Quality by means of Moving Averages and Regression Models
Formis, Gabriele, Scanzio, Stefano, Cena, Gianluca, Valenzano, Adriano
The ability to reliably predict the future quality of a wireless channel, as seen by the media access control layer, is a key enabler to improve performance of future industrial networks that do not rely on wires. Knowing in advance how much channel behavior may change can speed up procedures for adaptively selecting the best channel, making the network more deterministic, reliable, and less energy-hungry, possibly improving device roaming capabilities at the same time. To this aim, popular approaches based on moving averages and regression were compared, using multiple key performance indicators, on data captured from a real Wi-Fi setup. Moreover, a simple technique based on a linear combination of outcomes from different techniques was presented and analyzed, to further reduce the prediction error, and some considerations about lower bounds on achievable errors have been reported. We found that the best model is the exponential moving average, which managed to predict the frame delivery ratio with a 2.10\% average error and, at the same time, has lower computational complexity and memory consumption than the other models we analyzed.
Nonparametric regression using over-parameterized shallow ReLU neural networks
It is shown that over-parameterized neural networks can achieve minimax optimal rates of convergence (up to logarithmic factors) for learning functions from certain smooth function classes, if the weights are suitably constrained or regularized. Specifically, we consider the nonparametric regression of estimating an unknown $d$-variate function by using shallow ReLU neural networks. It is assumed that the regression function is from the H\"older space with smoothness $\alpha<(d+3)/2$ or a variation space corresponding to shallow neural networks, which can be viewed as an infinitely wide neural network. In this setting, we prove that least squares estimators based on shallow neural networks with certain norm constraints on the weights are minimax optimal, if the network width is sufficiently large. As a byproduct, we derive a new size-independent bound for the local Rademacher complexity of shallow ReLU neural networks, which may be of independent interest.
Specifying and Solving Robust Empirical Risk Minimization Problems Using CVXPY
Luxenberg, Eric, Malik, Dhruv, Li, Yuanzhi, Singh, Aarti, Boyd, Stephen
We consider robust empirical risk minimization (ERM), where model parameters are chosen to minimize the worst-case empirical loss when each data point varies over a given convex uncertainty set. In some simple cases, such problems can be expressed in an analytical form. In general the problem can be made tractable via dualization, which turns a min-max problem into a min-min problem. Dualization requires expertise and is tedious and error-prone. We demonstrate how CVXPY can be used to automate this dualization procedure in a user-friendly manner. Our framework allows practitioners to specify and solve robust ERM problems with a general class of convex losses, capturing many standard regression and classification problems. Users can easily specify any complex uncertainty set that is representable via disciplined convex programming (DCP) constraints.
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding
Oprescu, Miruna, Dorn, Jacob, Ghoummaid, Marah, Jesson, Andrew, Kallus, Nathan, Shalit, Uri
Estimating heterogeneous treatment effects from observational data is a crucial task across many fields, helping policy and decision-makers take better actions. There has been recent progress on robust and efficient methods for estimating the conditional average treatment effect (CATE) function, but these methods often do not take into account the risk of hidden confounding, which could arbitrarily and unknowingly bias any causal estimate based on observational data. We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on the level of hidden confounding. We derive the B-Learner by adapting recent results for sharp and valid bounds of the average treatment effect (Dorn et al., 2021) into the framework given by Kallus & Oprescu (2023) for robust and model-agnostic learning of conditional distributional treatment effects. The B-Learner can use any function estimator such as random forests and deep neural networks, and we prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods. Semi-synthetic experimental comparisons validate the theoretical findings, and we use real-world data to demonstrate how the method might be used in practice.
Shapley Value on Probabilistic Classifiers
Li, Xiang, Xia, Haocheng, Liu, Jinfei
Data valuation has become an increasingly significant discipline in data science due to the economic value of data. In the context of machine learning (ML), data valuation methods aim to equitably measure the contribution of each data point to the utility of an ML model. One prevalent method is Shapley value, which helps identify data points that are beneficial or detrimental to an ML model. However, traditional Shapley-based data valuation methods may not effectively distinguish between beneficial and detrimental training data points for probabilistic classifiers. In this paper, we propose Probabilistic Shapley (P-Shapley) value by constructing a probability-wise utility function that leverages the predicted class probabilities of probabilistic classifiers rather than binarized prediction results in the traditional Shapley value. We also offer several activation functions for confidence calibration to effectively quantify the marginal contribution of each data point to the probabilistic classifiers. Extensive experiments on four real-world datasets demonstrate the effectiveness of our proposed P-Shapley value in evaluating the importance of data for building a high-usability and trustworthy ML model.
Correcting for Selection Bias and Missing Response in Regression using Privileged Information
Boeken, Philip, de Kroon, Noud, de Jong, Mathijs, Mooij, Joris M., Zoeter, Onno
When estimating a regression model, we might have data where some labels are missing, or our data might be biased by a selection mechanism. When the response or selection mechanism is ignorable (i.e., independent of the response variable given the features) one can use off-the-shelf regression methods; in the nonignorable case one typically has to adjust for bias. We observe that privileged information (i.e. information that is only available during training) might render a nonignorable selection mechanism ignorable, and we refer to this scenario as Privilegedly Missing at Random (PMAR). We propose a novel imputation-based regression method, named repeated regression, that is suitable for PMAR. We also consider an importance weighted regression method, and a doubly robust combination of the two. The proposed methods are easy to implement with most popular out-of-the-box regression algorithms. We empirically assess the performance of the proposed methods with extensive simulated experiments and on a synthetically augmented real-world dataset. We conclude that repeated regression can appropriately correct for bias, and can have considerable advantage over weighted regression, especially when extrapolating to regions of the feature space where response is never observed.
Izindaba-Tindzaba: Machine learning news categorisation for Long and Short Text for isiZulu and Siswati
Madodonga, Andani, Marivate, Vukosi, Adendorff, Matthew
Local/Native South African languages are classified as low-resource languages. As such, it is essential to build the resources for these languages so that they can benefit from advances in the field of natural language processing. In this work, the focus was to create annotated news datasets for the isiZulu and Siswati native languages based on news topic classification tasks and present the findings from these baseline classification models. Due to the shortage of data for these native South African languages, the datasets that were created were augmented and oversampled to increase data size and overcome class classification imbalance. In total, four different classification models were used namely Logistic regression, Naive bayes, XGBoost and LSTM. These models were trained on three different word embeddings namely Bag-Of-Words, TFIDF and Word2vec. The results of this study showed that XGBoost, Logistic Regression and LSTM, trained from Word2vec performed better than the other combinations.