 Performance Analysis


Distributed Stochastic Algorithms for High-rate Streaming Principal Component Analysis

arXiv.org Machine Learning

This paper considers the problem of estimating the principal eigenvector of a covariance matrix from independent and identically distributed data samples in streaming settings. The streaming rate of data in many contemporary applications can be high enough that a single processor cannot finish an iteration of existing methods for eigenvector estimation before a new sample arrives. This paper formulates and analyzes a distributed variant of the classical Krasulina's method (D-Krasulina) that can keep up with the high streaming rate of data by distributing the computational load across multiple processing nodes. The analysis shows that---under appropriate conditions---D-Krasulina converges to the principal eigenvector in an order-wise optimal manner; i.e., after receiving $M$ samples across all nodes, its estimation error can be $O(1/M)$. In order to reduce the network communication overhead, the paper also develops and analyzes a mini-batch extension of D-Krasulina, which is termed DM-Krasulina. The analysis of DM-Krasulina shows that it can also achieve order-optimal estimation error rates under appropriate conditions, even when some samples have to be discarded within the network due to communication latency. Finally, experiments are performed over synthetic and real-world data to validate the convergence behaviors of D-Krasulina and DM-Krasulina in high-rate streaming settings.
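As a rough illustration of the update the abstract refers to, the sketch below applies the classical Krasulina rule, w ← w + γ (x xᵀw − (wᵀx xᵀw / ‖w‖²) w), averaged over one sample per node. The averaging scheme, the step size 1/(t + 10), the synthetic 2-D data, and the names `krasulina_step` and `run_demo` are assumptions made for illustration, not the paper's exact algorithm or experimental setup.

```python
import math
import random


def krasulina_step(w, samples, step):
    """One Krasulina update, averaged over a batch of samples.

    In a distributed setting, `samples` could hold one sample per node
    (an assumption of this sketch).
    """
    d = len(w)
    norm_sq = sum(v * v for v in w)
    xi = [0.0] * d
    for x in samples:
        proj = sum(xj * wj for xj, wj in zip(x, w))  # x^T w
        rayleigh = proj * proj / norm_sq             # w^T x x^T w / ||w||^2
        for j in range(d):
            xi[j] += proj * x[j] - rayleigh * w[j]
    return [w[j] + step * xi[j] / len(samples) for j in range(d)]


def run_demo(num_iters=2000, num_nodes=4, seed=0):
    """Stream synthetic 2-D data whose principal eigenvector is (1, 0)."""
    rng = random.Random(seed)
    w = [1.0, 1.0]
    for t in range(num_iters):
        # Variance 9 along the first axis, 0.25 along the second.
        batch = [[3.0 * rng.gauss(0, 1), 0.5 * rng.gauss(0, 1)]
                 for _ in range(num_nodes)]
        w = krasulina_step(w, batch, step=1.0 / (t + 10))
    norm = math.sqrt(sum(v * v for v in w))
    return abs(w[0]) / norm  # |cosine| between w and the true eigenvector


alignment = run_demo()
```

An `alignment` value close to 1 indicates that the estimate points along the dominant direction of the synthetic covariance.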


The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of Mislabeling

arXiv.org Artificial Intelligence

In this paper, we propose a new metric to measure goodness-of-fit for classifiers, the Real World Cost function. This metric factors in information about a real-world problem, such as financial impact, that other measures like accuracy or F1 do not. This metric is also more directly interpretable for users. To optimize for this metric, we introduce the Real-World-Weight Cross-Entropy loss function, in both binary and single-label classification variants. Both variants allow direct input of real-world costs as weights. For single-label, multicategory classification, our loss function also allows direct penalization of probabilistic false positives, weighted by label, during the training of a machine learning model. We compare the design of our loss function to the binary cross-entropy and categorical cross-entropy functions, as well as their weighted variants, to discuss the potential for improvement in handling a variety of known shortcomings of machine learning, ranging from imbalanced classes to medical diagnostic error to reinforcement of social bias. We create scenarios that emulate those issues using the MNIST data set and demonstrate empirical results of our new loss function. Finally, we sketch a proof of this function based on Maximum Likelihood Estimation and discuss future directions.
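To make the idea of feeding real-world costs into a loss concrete, here is a minimal sketch of a generic cost-weighted binary cross-entropy. It is not necessarily the paper's exact formulation: the function name `cost_weighted_bce` and the parameters `fn_cost` and `fp_cost` (the assumed costs of a false negative and a false positive) are hypothetical.

```python
import math


def cost_weighted_bce(y_true, y_prob, fn_cost, fp_cost, eps=1e-12):
    """Binary cross-entropy with separate weights on the two error types.

    fn_cost scales the penalty for low probability on true positives;
    fp_cost scales the penalty for high probability on true negatives.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += fn_cost * y * math.log(p)
        total += fp_cost * (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# With equal unit costs this reduces to the ordinary binary cross-entropy.
baseline = cost_weighted_bce([1], [0.5], fn_cost=1.0, fp_cost=1.0)
# A missed positive is penalized more heavily when false negatives cost more.
missed_positive = cost_weighted_bce([1], [0.1], fn_cost=5.0, fp_cost=1.0)
false_alarm = cost_weighted_bce([0], [0.9], fn_cost=5.0, fp_cost=1.0)
```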


How to Calculate Precision, Recall, and F-Measure for Imbalanced Classification

#artificialintelligence

Classification accuracy is the total number of correct predictions divided by the total number of predictions made for a dataset. As a performance measure, accuracy is inappropriate for imbalanced classification problems. The main reason is that the overwhelming number of examples from the majority class (or classes) will overwhelm the number of examples in the minority class, meaning that even unskillful models can achieve accuracy scores of 90 percent, or 99 percent, depending on how severe the class imbalance happens to be. An alternative to using classification accuracy is to use precision and recall metrics. In this tutorial, you will discover how to calculate and develop an intuition for precision and recall for imbalanced classification.
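The point about accuracy can be illustrated with a small sketch that computes precision, recall, and the F-measure from label lists. The helper name `precision_recall_f1` and the toy 95/5 class split are illustrative assumptions, not taken from the tutorial itself.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy imbalanced dataset: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# A model that always predicts the majority class looks accurate...
always_negative = [0] * 100
accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
# ...but never finds a positive example.
majority_scores = precision_recall_f1(y_true, always_negative)

# A model that finds 3 of the 5 positives and raises 2 false alarms.
y_pred = [0] * 93 + [1] * 2 + [1] * 3 + [0] * 2
model_scores = precision_recall_f1(y_true, y_pred)
```

Here the majority-class predictor reaches 95 percent accuracy while its recall is zero, which is the gap that precision and recall are meant to expose.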


Ensemble emotion recognizing with multiple modal physiological signals

arXiv.org Machine Learning

Physiological signals, which provide an objective representation of human affective states, are attracting increasing attention in the emotion recognition field. However, a single signal rarely describes an emotion completely and accurately. Multi-signal fusion models build a uniform classification model from the consistent and complementary information of different emotions to improve recognition performance. Existing fusion models usually choose one particular classification method for recognition, which ignores the different distributions of the multiple signals. To address these problems, we propose an emotion classification model based on multimodal physiological signals for different emotions. Features are extracted from EEG, EMG, and EOG signals to characterize emotional state on the valence and arousal levels. For characterization, the signals are preprocessed by filtering into four bands (theta, beta, alpha, gamma), and three Hjorth parameters are computed as features. To improve classification performance, an ensemble classifier is built. Experiments are conducted on the benchmark DEAP dataset. For the two-class task, the best results are 94.42% on arousal and 94.02% on valence. For the four-class task, the highest average classification accuracy is 90.74, and the model shows good stability. The influence of different peripheral physiological signals on the results is also analyzed in this paper.
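For readers unfamiliar with the three Hjorth parameters mentioned above, the following sketch computes them with their standard definitions (activity, mobility, complexity) using first differences as the discrete derivative. The function names and the sine-wave test signal are assumptions for illustration, and the abstract does not state how the authors implemented the computation.

```python
import math


def _variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def _diff(xs):
    """First difference, used here as a discrete derivative."""
    return [b - a for a, b in zip(xs, xs[1:])]


def hjorth_parameters(signal):
    """Return (activity, mobility, complexity) of a 1-D signal."""
    d1 = _diff(signal)
    d2 = _diff(d1)
    activity = _variance(signal)                            # signal power
    mobility = math.sqrt(_variance(d1) / activity)          # mean frequency proxy
    mobility_d1 = math.sqrt(_variance(d2) / _variance(d1))
    complexity = mobility_d1 / mobility                     # close to 1 for a pure sine
    return activity, mobility, complexity


# Ten full periods of a unit-amplitude sine wave, 1000 samples.
sine = [math.sin(2 * math.pi * 10 * n / 1000) for n in range(1000)]
activity, mobility, complexity = hjorth_parameters(sine)
```

In a pipeline like the one described, such a function would presumably be applied to each band-filtered channel to form the feature vector.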


Direction Concentration Learning: Enhancing Congruency in Machine Learning

arXiv.org Machine Learning

One of the well-known challenges in computer vision tasks is the visual diversity of images, which could result in an agreement or disagreement between the learned knowledge and the visual content exhibited by the current observation. In this work, we first define such an agreement in a concept learning process as congruency. Formally, given a particular task and a sufficiently large dataset, the congruency issue occurs in the learning process whereby the task-specific semantics in the training data are highly varying. We propose a Direction Concentration Learning (DCL) method to improve congruency in the learning process, where enhancing congruency influences the convergence path to be less circuitous. The experimental results show that the proposed DCL method generalizes to state-of-the-art models and optimizers, and improves performance on the saliency prediction, continual learning, and classification tasks. Moreover, it helps mitigate the catastrophic forgetting problem in the continual learning task. The code is publicly available at https://github.com/luoyan407/congruency.


Leveraging Semi-Supervised Learning for Fairness using Neural Networks

arXiv.org Artificial Intelligence

There has been a growing concern about the fairness of decision-making systems based on machine learning. The shortage of labeled data has always been a challenging problem facing machine learning based systems. In such scenarios, semi-supervised learning has been shown to be an effective way of exploiting unlabeled data to improve the performance of the model. Notably, unlabeled data do not contain label information, which itself can be a significant source of bias in training machine learning systems. This inspired us to tackle the challenge of fairness by formulating the problem in a semi-supervised framework. In this paper, we propose a semi-supervised algorithm using neural networks that benefits from unlabeled data to improve not just the performance but also the fairness of the decision-making process. The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.


Deep Learning-Based Intrusion Detection System for Advanced Metering Infrastructure

arXiv.org Machine Learning

The smart grid is an alternative to the conventional power grid that harnesses information technology to save energy and meet today's environmental requirements. Due to the inherent vulnerabilities of information technology, the smart grid is exposed to a wide variety of threats that could be translated into cyber-attacks. In this paper, we develop a deep learning-based intrusion detection system to defend against cyber-attacks in the advanced metering infrastructure network. The proposed machine learning approach is trained and tested extensively on an empirical industrial dataset composed of several attack categories, including scanning, buffer overflow, and denial-of-service attacks. Then, an experimental comparison in terms of detection accuracy is conducted to evaluate the performance of the proposed approach against Naive Bayes, Support Vector Machine, and Random Forest. The obtained results suggest that the proposed approach produces optimal results compared to the other algorithms. Finally, we propose a network architecture to deploy the proposed anomaly-based intrusion detection system across the Advanced Metering Infrastructure network. In addition, we propose a network security architecture composed of two types of intrusion detection systems, host-based and network-based, deployed across the Advanced Metering Infrastructure network to inspect traffic and detect malicious traffic at all levels.


Infra-slow brain dynamics as a marker for cognitive function and decline

Neural Information Processing Systems

Functional magnetic resonance imaging (fMRI) enables measuring human brain activity, in vivo. Yet, the fMRI hemodynamic response unfolds over very slow timescales ( 0.1-1 Hz), orders of magnitude slower than millisecond timescales of neural spiking. It is unclear, therefore, if slow dynamics as measured with fMRI are relevant for cognitive function. We investigated this question with a novel application of Gaussian Process Factor Analysis (GPFA) and machine learning to fMRI data. We analyzed slowly sampled (1.4 Hz) fMRI data from 1000 healthy human participants (Human Connectome Project database), and applied GPFA to reduce dimensionality and extract smooth latent dynamics. GPFA dimensions with slow ( 1 Hz) characteristic timescales identified, with high accuracy ( 95%), the specific task that each subject was performing inside the fMRI scanner. Moreover, functional connectivity between slow GPFA latents accurately predicted interindividual differences in behavioral scores across a range of cognitive tasks. Finally, infra-slow ( 0.1 Hz) latent dynamics predicted CDR (Clinical Dementia Rating) scores of individual patients, and identified patients with mild cognitive impairment (MCI) who would progress to develop Alzheimer's dementia (AD). Slow and infra-slow brain dynamics may be relevant for understanding the neural basis of cognitive function, in health and disease.


Modeling and Counteracting Exposure Bias in Recommender Systems

arXiv.org Machine Learning

What we discover and see online, and consequently our opinions and decisions, are becoming increasingly affected by automated machine-learned predictions. Similarly, the predictive accuracy of learning machines heavily depends on the feedback data that we provide them. This mutual influence can lead to closed-loop interactions that may cause unknown biases, which can be exacerbated after several iterations of machine learning predictions and user feedback. Machine-caused biases risk leading to undesirable social effects ranging from polarization to unfairness and filter bubbles. In this paper, we study the bias inherent in widely used recommendation strategies such as matrix factorization. Then we model the exposure that is born from the interaction between the user and the recommender system and propose new debiasing strategies for these systems. Finally, we try to mitigate the recommendation system bias by engineering solutions for several state-of-the-art recommender system models. Our results show that recommender systems are biased and depend on the prior exposure of the user. We also show that the studied bias iteratively decreases diversity in the output recommendations. Our debiasing method demonstrates the need for alternative recommendation strategies that take into account the exposure process in order to reduce bias. Our research findings show the importance of understanding the nature of, and dealing with, bias in machine learning models such as recommender systems that interact directly with humans and thus exert an increasing influence on human discovery and decision making.


A Performance Comparison of Data Mining Algorithms Based Intrusion Detection System for Smart Grid

arXiv.org Machine Learning

Smart grid is an emerging and promising technology. It uses the power of information technologies to deliver electrical power to customers intelligently, and it allows the integration of green technology to meet environmental requirements. Unfortunately, information technologies have inherent vulnerabilities and weaknesses that expose the smart grid to a wide variety of security risks. The intrusion detection system (IDS) plays an important role in securing smart grid networks and detecting malicious activity, yet it suffers from several limitations. Many research papers have been published to address these issues using several algorithms and techniques. Therefore, a detailed comparison between these algorithms is needed. This paper presents an overview of four data mining algorithms used by IDS in the smart grid. An evaluation of the performance of these algorithms is conducted based on several metrics, including the probability of detection, probability of false alarm, probability of miss detection, efficiency, and processing time. Results show that Random Forest outperforms the other three algorithms in detecting attacks, with a higher probability of detection, lower probability of false alarm, lower probability of miss detection, and higher accuracy.
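The detection metrics listed above can be derived from confusion-matrix counts. The sketch below uses the conventional definitions (detection = TP/(TP+FN), false alarm = FP/(FP+TN), miss = FN/(TP+FN)), which are assumed to match the paper's usage. The function name `detection_metrics` and the example counts are hypothetical.

```python
def detection_metrics(tp, fp, tn, fn):
    """Derive IDS-style detection metrics from confusion-matrix counts.

    tp: attacks flagged as attacks      fn: attacks missed
    fp: normal traffic flagged          tn: normal traffic passed
    """
    attacks = tp + fn
    normal = fp + tn
    return {
        "p_detection": tp / attacks,
        "p_false_alarm": fp / normal,
        "p_miss": fn / attacks,
        "accuracy": (tp + tn) / (attacks + normal),
    }


# Hypothetical counts: 100 attack records and 100 normal records.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

Note that the probabilities of detection and miss detection always sum to 1 under these definitions, so reporting both is a matter of convention.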