Goto

Collaborating Authors

 Regression


Determination of toxic comments and unintended model bias minimization using Deep learning approach

arXiv.org Artificial Intelligence

Online conversations can be toxic and subjected to threats, abuse, or harassment. To identify toxic text comments, several deep learning and machine learning models have been proposed throughout the years. However, recent studies demonstrate that because of the imbalances in the training data, some models are more likely to show unintended biases including gender bias and identity bias. In this research, our aim is to detect toxic comment and reduce the unintended bias concerning identity features such as race, gender, sex, religion by fine-tuning an attention based model called BERT(Bidirectional Encoder Representation from Transformers). We apply weighted loss to address the issue of unbalanced data and compare the performance of a fine-tuned BERT model with a traditional Logistic Regression model in terms of classification and bias minimization. The Logistic Regression model with the TFIDF vectorizer achieve 57.1% accuracy, and fine-tuned BERT model's accuracy is 89%. Code is available at https://github.com/zim10/Determine_Toxic_comment_and_identity_bias.git


Joint processing of linguistic properties in brains and language models

arXiv.org Artificial Intelligence

Language models have been shown to be very effective in predicting brain recordings of subjects experiencing complex language stimuli. For a deeper understanding of this alignment, it is important to understand the correspondence between the detailed processing of linguistic information by the human brain versus language models. We investigate this correspondence via a direct approach, in which we eliminate information related to specific linguistic properties in the language model representations and observe how this intervention affects the alignment with fMRI brain recordings obtained while participants listened to a story. We investigate a range of linguistic properties (surface, syntactic, and semantic) and find that the elimination of each one results in a significant decrease in brain alignment. Specifically, we find that syntactic properties (i.e. Top Constituents and Tree Depth) have the largest effect on the trend of brain alignment across model layers. These findings provide clear evidence for the role of specific linguistic information in the alignment between brain and language models, and open new avenues for mapping the joint information processing in both systems.


Meta-learning of semi-supervised learning from tasks with heterogeneous attribute spaces

arXiv.org Machine Learning

We propose a meta-learning method for semi-supervised learning that learns from multiple tasks with heterogeneous attribute spaces. The existing semi-supervised meta-learning methods assume that all tasks share the same attribute space, which prevents us from learning with a wide variety of tasks. With the proposed method, the expected test performance on tasks with a small amount of labeled data is improved with unlabeled data as well as data in various tasks, where the attribute spaces are different among tasks. The proposed method embeds labeled and unlabeled data simultaneously in a task-specific space using a neural network, and the unlabeled data's labels are estimated by adapting classification or regression models in the embedding space. For the neural network, we develop variable-feature self-attention layers, which enable us to find embeddings of data with different attribute spaces with a single neural network by considering interactions among examples, attributes, and labels. Our experiments on classification and regression datasets with heterogeneous attribute spaces demonstrate that our proposed method outperforms the existing meta-learning and semi-supervised learning methods.


Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients

arXiv.org Machine Learning

We present an unbiased method for Bayesian posterior means based on kinetic Langevin dynamics that combines advanced splitting methods with enhanced gradient approximations. Our approach avoids Metropolis correction by coupling Markov chains at different discretization levels in a multilevel Monte Carlo approach. Theoretical analysis demonstrates that our proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem. It can achieve accuracy $\epsilon>0$ for estimating expectations of Lipschitz functions in $d$ dimensions with $\mathcal{O}(d^{1/4}\epsilon^{-2})$ expected gradient evaluations, without assuming warm start. We exhibit similar bounds using both approximate and stochastic gradients, and our method's computational cost is shown to scale logarithmically with the size of the dataset. The proposed method is tested using a multinomial regression problem on the MNIST dataset and a Poisson regression model for soccer scores. Experiments indicate that the number of gradient evaluations per effective sample is independent of dimension, even when using inexact gradients. For product distributions, we give dimension-independent variance bounds. Our results demonstrate that the unbiased algorithm we present can be much more efficient than the ``gold-standard" randomized Hamiltonian Monte Carlo.


How Reliable is Your Regression Model's Uncertainty Under Real-World Distribution Shifts?

arXiv.org Artificial Intelligence

Many important computer vision applications are naturally formulated as regression problems. Within medical imaging, accurate regression models have the potential to automate various tasks, helping to lower costs and improve patient outcomes. Such safety-critical deployment does however require reliable estimation of model uncertainty, also under the wide variety of distribution shifts that might be encountered in practice. Motivated by this, we set out to investigate the reliability of regression uncertainty estimation methods under various real-world distribution shifts. To that end, we propose an extensive benchmark of 8 image-based regression datasets with different types of challenging distribution shifts. We then employ our benchmark to evaluate many of the most common uncertainty estimation methods, as well as two state-of-the-art uncertainty scores from the task of out-of-distribution detection. We find that while methods are well calibrated when there is no distribution shift, they all become highly overconfident on many of the benchmark datasets. This uncovers important limitations of current uncertainty estimation methods, and the proposed benchmark therefore serves as a challenge to the research community. We hope that our benchmark will spur more work on how to develop truly reliable regression uncertainty estimation methods. Code is available at https://github.com/fregu856/regression_uncertainty.


Adaptive Linear Estimating Equations

arXiv.org Machine Learning

Sequential data collection has emerged as a widely adopted technique for enhancing the efficiency of data gathering processes. Despite its advantages, such data collection mechanism often introduces complexities to the statistical inference procedure. For instance, the ordinary least squares (OLS) estimator in an adaptive linear regression model can exhibit non-normal asymptotic behavior, posing challenges for accurate inference and interpretation. In this paper, we propose a general method for constructing debiased estimator which remedies this issue. It makes use of the idea of adaptive linear estimating equations, and we establish theoretical guarantees of asymptotic normality, supplemented by discussions on achieving near-optimal asymptotic variance. A salient feature of our estimator is that in the context of multi-armed bandits, our estimator retains the non-asymptotic performance of the least squares estimator while obtaining asymptotic normality property. Consequently, this work helps connect two fruitful paradigms of adaptive inference: a) non-asymptotic inference using concentration inequalities and b) asymptotic inference via asymptotic normality.


Sparse Interaction Additive Networks via Feature Interaction Detection and Sparse Selection

arXiv.org Machine Learning

There is currently a large gap in performance between the statistically rigorous methods like linear regression or additive splines and the powerful deep methods using neural networks. Previous works attempting to close this gap have failed to fully investigate the exponentially growing number of feature combinations which deep networks consider automatically during training. In this work, we develop a tractable selection algorithm to efficiently identify the necessary feature combinations by leveraging techniques in feature interaction detection. Our proposed Sparse Interaction Additive Networks (SIAN) construct a bridge from these simple and interpretable models to fully connected neural networks. SIAN achieves competitive performance against state-of-the-art methods across multiple large-scale tabular datasets and consistently finds an optimal tradeoff between the modeling capacity of neural networks and the generalizability of simpler methods.


Dimensions of Online Conflict: Towards Modeling Agonism

arXiv.org Artificial Intelligence

Agonism plays a vital role in democratic dialogue by fostering diverse perspectives and robust discussions. Within the realm of online conflict there is another type: hateful antagonism, which undermines constructive dialogue. Detecting conflict online is central to platform moderation and monetization. It is also vital for democratic dialogue, but only when it takes the form of agonism. To model these two types of conflict, we collected Twitter conversations related to trending controversial topics. We introduce a comprehensive annotation schema for labelling different dimensions of conflict in the conversations, such as the source of conflict, the target, and the rhetorical strategies deployed. Using this schema, we annotated approximately 4,000 conversations with multiple labels. We then trained both logistic regression and transformer-based models on the dataset, incorporating context from the conversation, including the number of participants and the structure of the interactions. Results show that contextual labels are helpful in identifying conflict and make the models robust to variations in topic. Our research contributes a conceptualization of different dimensions of conflict, a richly annotated dataset, and promising results that can contribute to content moderation.


Brain Networks and Intelligence: A Graph Neural Network Based Approach to Resting State fMRI Data

arXiv.org Artificial Intelligence

Resting-state functional magnetic resonance imaging (rsfMRI) is a powerful tool for investigating the relationship between brain function and cognitive processes as it allows for the functional organization of the brain to be captured without relying on a specific task or stimuli. In this paper, we present a novel modeling architecture called BrainRGIN for predicting intelligence (fluid, crystallized, and total intelligence) using graph neural networks on rsfMRI derived static functional network connectivity matrices. Extending from the existing graph convolution networks, our approach incorporates a clustering-based embedding and graph isomorphism network in the graph convolutional layer to reflect the nature of the brain sub-network organization and efficient network expression, in combination with TopK pooling and attention-based readout functions. We evaluated our proposed architecture on a large dataset, specifically the Adolescent Brain Cognitive Development Dataset, and demonstrated its effectiveness in predicting individual differences in intelligence. Our model achieved lower mean squared errors and higher correlation scores than existing relevant graph architectures and other traditional machine learning models for all of the intelligence prediction tasks. The middle frontal gyrus exhibited a significant contribution to both fluid and crystallized intelligence, suggesting their pivotal role in these cognitive processes. Total composite scores identified a diverse set of brain regions to be relevant which underscores the complex nature of total intelligence.


Feature selection and regression methods for stock price prediction using technical indicators

arXiv.org Artificial Intelligence

Due to the influence of many factors, including technical indicators on stock price prediction, feature selection is important to choose the best indicators. This study uses technical indicators and features selection and regression methods to solve the problem of closing the stock market price. The aim of this research is to predict the stock market price with the least error. By the proposed method, the data created by the 3-day time window were converted to the appropriate input for regression methods. In this paper, 10 regressor and 123 technical indicators have been examined on data of the last 13 years of Apple Company. The results have been investigated by 5 error-based evaluation criteria. Based on results of the proposed method, MLPSF has 56/47% better performance than MLP. Also, SVRSF has 67/42% improved compared to SVR. LRSF was 76.7 % improved compared to LR. The RISF method also improved 72.82 % of Ridge regression. The DTRSB method had 24.23 % improvement over DTR. KNNSB had 15.52 % improvement over KNN regression. RFSB had a 6 % improvement over RF. GBRSF also improved at 7% over GBR. Finally, ADASF and ADASB also had a 4% improvement over the ADA regression. Also, Ridge and LinearRegression had the best results for stock price prediction. Based on results, the best indicators to predict stock price are: the Squeeze_pro, Percentage Price Oscillator, Thermo, Decay, Archer On-Balance Volume, Bollinger Bands, Squeeze and Ichimoku indicator. According to the results, the use of suitable combination of suggested indicators along with regression methods has resulted in high accuracy in predicting the closing price.