Performance Analysis
An electronic neuromorphic system for real-time detection of High Frequency Oscillations (HFOs) in intracranial EEG
Sharifhazileh, Mohammadali, Burelo, Karla, Sarnthein, Johannes, Indiveri, Giacomo
In this work, we present a neuromorphic system that combines for the first time a neural recording headstage with a signal-to-spike conversion circuit and a multi-core spiking neural network (SNN) architecture on the same die for recording, processing, and detecting High Frequency Oscillations (HFOs), which are biomarkers for the epileptogenic zone. The device was fabricated using a standard 0.18$\mu$m CMOS technology node and has a total area of 99mm$^{2}$. We demonstrate its application to HFO detection in the iEEG recorded from 9 patients with temporal lobe epilepsy who subsequently underwent epilepsy surgery. The total average power consumption of the chip during the detection task was 614.3$\mu$W. We show how the neuromorphic system can reliably detect HFOs: the system predicts postsurgical seizure outcome with state-of-the-art accuracy, specificity and sensitivity (78%, 100%, and 33% respectively). This is the first feasibility study towards identifying relevant features in intracranial human data in real-time, on-chip, using event-based processors and spiking neural networks. By providing "neuromorphic intelligence" to neural recording circuits, the proposed approach will pave the way for the development of systems that can detect HFO areas directly in the operating room and improve the seizure outcome of epilepsy surgery.
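For readers less familiar with these metrics, the standard definitions are worked through below. The abstract does not report the underlying patient counts; the values $TP = 1$, $FN = 2$, $TN = 6$, $FP = 0$ are an assumption, chosen only because they are a split of 9 patients that reproduces the reported percentages.

```latex
\begin{align*}
\text{sensitivity} &= \frac{TP}{TP + FN} = \frac{1}{1 + 2} \approx 33\% \\
\text{specificity} &= \frac{TN}{TN + FP} = \frac{6}{6 + 0} = 100\% \\
\text{accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} = \frac{1 + 6}{9} \approx 78\%
\end{align*}
```

With a cohort this small, a single patient moves sensitivity by a third, so the percentages are best read alongside the cohort size.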
ROC-AUC Curve For Comprehensive Analysis Of Machine Learning Models
In machine learning, we rarely build just one model for a classification task. Different algorithms behave differently on different datasets, so we typically train several candidate models and keep the one that performs best on the data at hand. To compare them, we cannot always rely on a single metric such as accuracy: on an imbalanced dataset, a model that always predicts the majority class can still achieve a high accuracy score. It is therefore important to check whether the model predicts positive instances as positive and negative instances as negative.
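The point about accuracy on imbalanced data can be made concrete with a minimal sketch. The dataset below (95 negatives, 5 positives) and the always-negative "model" are hypothetical, and `confusion_counts` is an illustrative helper, not part of any library.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


# Hypothetical imbalanced test set: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority (negative) class.
y_pred = [0] * 100

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
true_positive_rate = tp / (tp + fn)   # share of positives predicted as positive
true_negative_rate = tn / (tn + fp)   # share of negatives predicted as negative

print(f"accuracy={accuracy:.2f}  TPR={true_positive_rate:.2f}  TNR={true_negative_rate:.2f}")
```

Accuracy looks strong here even though no positive instance is ever detected, which is why per-class rates (the quantities an ROC curve is built from) are the more informative comparison.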
Online AUC Optimization for Sparse High-Dimensional Datasets
Zhou, Baojian, Ying, Yiming, Skiena, Steven
The Area Under the ROC Curve (AUC) is a widely used performance measure for imbalanced classification arising from many application domains where high-dimensional sparse data is abundant. In such cases, each $d$ dimensional sample has only $k$ non-zero features with $k \ll d$, and data arrives sequentially in a streaming form. Current online AUC optimization algorithms have a high per-iteration cost of $\mathcal{O}(d)$ and usually produce non-sparse solutions, and hence are not suitable for handling the data challenge mentioned above. In this paper, we aim to directly optimize the AUC score for high-dimensional sparse datasets under the online learning setting and propose a new algorithm, \textsc{FTRL-AUC}. Our proposed algorithm can process data in an online fashion with a much cheaper per-iteration cost of $\mathcal{O}(k)$, making it amenable to high-dimensional sparse streaming data analysis. Our new algorithmic design critically depends on a novel reformulation of the U-statistics AUC objective function as the empirical saddle point reformulation, and the innovative introduction of the "lazy update" rule so that the per-iteration complexity is dramatically reduced from $\mathcal{O}(d)$ to $\mathcal{O}(k)$. Furthermore, \textsc{FTRL-AUC} can inherently capture sparsity more effectively by applying a generalized Follow-The-Regularized-Leader (FTRL) framework. Experiments on real-world datasets demonstrate that \textsc{FTRL-AUC} significantly improves both run time and model sparsity while achieving competitive AUC scores compared with the state-of-the-art methods. Comparison with the online learning method for logistic loss demonstrates that \textsc{FTRL-AUC} achieves higher AUC scores, especially when datasets are imbalanced.
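As background for the two ideas in this abstract, the sketch below shows (a) scoring a sparse sample by touching only its $k$ non-zero features and (b) the pairwise (U-statistic) definition of AUC. It is not \textsc{FTRL-AUC}: there is no learning step here, and the pairwise loop is exactly the batch computation an online method avoids. The dict-based sparse format, the function names, and all weights and samples are assumptions made for illustration.

```python
def sparse_score(weights, sample):
    """Linear score that visits only the non-zero features of `sample`.

    Both arguments map feature index -> value, so the cost is O(k)
    for a sample with k non-zero entries, independent of dimension d.
    """
    return sum(weights.get(i, 0.0) * v for i, v in sample.items())


def pairwise_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 1/2, following the usual U-statistic convention.
    """
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


# Hypothetical sparse weight vector and samples (feature index -> value).
weights = {0: 1.0, 7: -2.0, 42: 0.5}
positives = [{0: 1.0}, {42: 3.0}, {7: 0.5}]   # scores: 1.0, 1.5, -1.0
negatives = [{7: 1.0}, {0: 0.5}]              # scores: -2.0, 0.5

pos_scores = [sparse_score(weights, s) for s in positives]
neg_scores = [sparse_score(weights, s) for s in negatives]
auc = pairwise_auc(pos_scores, neg_scores)
print(f"AUC = {auc:.4f}")
```

Five of the six positive-negative pairs are ordered correctly, so the pairwise AUC is 5/6.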
The Use of AI for Thermal Emotion Recognition: A Review of Problems and Limitations in Standard Design and Data
Ordun, Catherine, Raff, Edward, Purushotham, Sanjay
With the increased attention on thermal imagery for Covid-19 screening, the public sector may believe there are new opportunities to exploit thermal as a modality for computer vision and AI. Thermal physiology research has been ongoing since the late nineties. This research lies at the intersections of medicine, psychology, machine learning, optics, and affective computing. We will review the known factors of thermal vs. RGB imaging for facial emotion recognition. But we also propose that thermal imagery may provide a semi-anonymous modality for computer vision, over RGB, which has been plagued by misuse in facial recognition. However, the transition to adopting thermal imagery as a source for any human-centered AI task is not easy and relies on the availability of high fidelity data sources across multiple demographics and thorough validation. This paper takes the reader on a short review of machine learning in thermal FER and the limitations of collecting and developing thermal FER data for AI training. Our motivation is to provide an introductory overview into recent advances for thermal FER and stimulate conversation about the limitations in current datasets.
Local Post-Hoc Explanations for Predictive Process Monitoring in Manufacturing
Mehdiyev, Nijat, Fettke, Peter
This study proposes an innovative explainable process prediction solution to facilitate data-driven decision making for process planning in manufacturing. After integrating the top-floor and shop-floor data obtained from various enterprise information systems, especially from Manufacturing Execution Systems, a deep neural network was applied to predict the process outcomes. Since we aim to operationalize the delivered predictive insights by embedding them into decision-making processes, it is essential to generate relevant explanations for domain experts. To this end, two local post-hoc explanation approaches, Shapley Values and Individual Conditional Expectation (ICE) plots, are applied, which are expected to enhance the decision-making capabilities by enabling experts to examine explanations from different perspectives. After assessing the predictive strength of the adopted deep neural networks with relevant binary classification evaluation measures, a discussion of the generated explanations is provided. Lastly, a brief discussion of ongoing activities in the scope of the current emerging application and some aspects of the future implementation plan concludes the study.
Gamma distribution-based sampling for imbalanced data
Kamalov, Firuz, Denisov, Dmitry
Imbalanced class distribution is a common problem in a number of fields including medical diagnostics, fraud detection, and others. It causes bias in classification algorithms leading to poor performance on the minority class data. In this paper, we propose a novel method for balancing the class distribution in data through intelligent resampling of the minority class instances. The proposed method is based on generating new minority instances in the neighborhood of the existing minority points via a gamma distribution. Our method offers a natural and coherent approach to balancing the data. We conduct a comprehensive numerical analysis of the new sampling technique. The experimental results show that the proposed method outperforms the existing state-of-the-art methods for imbalanced data. Concretely, the new sampling technique produces the best results on 12 out of 24 real life as well as synthetic datasets. For comparison, the SMOTE method achieves the top score on only 1 dataset. We conclude that the new technique offers a simple yet effective sampling approach to balance data.
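The abstract does not spell out the sampling procedure, so the following is only a loose sketch of the general idea of placing synthetic minority points near existing ones using a gamma-distributed offset. The specific construction (moving from a minority point toward another randomly chosen minority point by a gamma-distributed step), the function name `gamma_oversample`, and the default `shape` and `scale` values are all assumptions, not the authors' algorithm.

```python
import random


def gamma_oversample(minority, n_new, shape=2.0, scale=0.1, seed=0):
    """Generate `n_new` synthetic points near existing minority points.

    Assumed construction (not taken from the paper): pick two distinct
    minority points, then step from the first toward the second by a
    fraction t drawn from Gamma(shape, scale). A small mean step
    (shape * scale) keeps new points close to the starting point.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base, other = rng.sample(minority, 2)
        t = rng.gammavariate(shape, scale)  # stdlib: alpha=shape, beta=scale
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical 2-D minority class with four points.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = gamma_oversample(minority_points, n_new=5)
for point in new_points:
    print(point)
```

Because the gamma distribution is unbounded above, a step can occasionally overshoot the second point; whether to clip such steps is a design choice this sketch leaves open.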
Do the Machine Learning Models on a Crowd Sourced Platform Exhibit Bias? An Empirical Study on Model Fairness
Machine learning models are increasingly being used in important decision-making software such as approving bank loans, recommending criminal sentencing, hiring employees, and so on. It is important to ensure the fairness of these models so that no discrimination is made based on a protected attribute (e.g., race, sex, age) during decision making. Algorithms have been developed to measure unfairness and mitigate it to a certain extent. In this paper, we have focused on the empirical evaluation of fairness and mitigations on real-world machine learning models. We have created a benchmark of 40 top-rated models from Kaggle used for 5 different tasks, and then, using a comprehensive set of fairness metrics, evaluated their fairness. Then, we have applied 7 mitigation techniques on these models and analyzed the fairness, mitigation results, and impacts on performance. We have found that some model optimization techniques result in inducing unfairness in the models. On the other hand, although there are some fairness control mechanisms in machine learning libraries, they are not documented. The mitigation algorithms also exhibit common patterns: mitigation in the post-processing stage is often costly (in terms of performance), and mitigation in the pre-processing stage is preferred in most cases. We have also presented different trade-off choices of fairness mitigation decisions. Our study suggests future research directions to reduce the gap between theoretical fairness-aware algorithms and the software engineering methods to leverage them in practice.
Mean Average Precision for Clients
Disclaimer: This project was created for my clients because it's rather challenging to explain such a complex metric simply. Don't expect to see much math or many equations here, and please remember that I try to keep it simple. Accuracy is the most vanilla metric out there. Imagine we are classifying whether there is a dog in a picture. To test our classifier, we prepare a test set of pictures both with and without dogs. We then apply our classifier to every picture and get the predicted classes.
TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval
Awad, George, Butt, Asad A., Curtis, Keith, Lee, Yooyoung, Fiscus, Jonathan, Godil, Afzal, Delgado, Andrew, Zhang, Jesse, Godard, Eliot, Diduch, Lukas, Smeaton, Alan F., Graham, Yvette, Kraaij, Wessel, Quenot, Georges
The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last nineteen years this effort has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. TRECVID has been funded by NIST (National Institute of Standards and Technology) and other US government agencies. In addition, many organizations and individuals worldwide contribute significant time and effort. TRECVID 2019 represented a continuation of four tasks from TRECVID 2018. In total, 27 teams from various research organizations worldwide completed one or more of the following four tasks: (1) Ad-hoc Video Search (AVS); (2) Instance Search (INS); (3) Activities in Extended Video (ActEV); (4) Video to Text Description (VTT). This paper is an introduction to the evaluation framework, tasks, data, and measures used in the workshop.
Optimal Provable Robustness of Quantum Classification via Quantum Hypothesis Testing
Weber, Maurice, Liu, Nana, Li, Bo, Zhang, Ce, Zhao, Zhikuan
Quantum machine learning models have the potential to offer speedups and better predictive accuracy compared to their classical counterparts. However, these quantum algorithms, like their classical counterparts, have also been shown to be vulnerable to input perturbations, in particular for classification problems. These can arise either from noisy implementations or, as a worst-case type of noise, adversarial attacks. These attacks can undermine both the reliability and security of quantum classification algorithms. In order to develop defence mechanisms and to better understand the reliability of these algorithms, it is crucial to understand their robustness properties in the presence of both natural noise sources and adversarial manipulation. From the observation that, unlike in the classical setting, measurements involved in quantum classification algorithms are naturally probabilistic, we uncover and formalize a fundamental link between binary quantum hypothesis testing (QHT) and provably robust quantum classification. Then, from the optimality of QHT, we prove a robustness condition, which is tight under modest assumptions and enables us to develop a protocol to certify robustness. Since this robustness condition is a guarantee against the worst-case noise scenarios, our result naturally extends to scenarios in which the noise source is known. Thus we also provide a framework to study the reliability of quantum classification protocols under more general settings.