 Performance Analysis


Proxy Fairness

arXiv.org Machine Learning

We consider the problem of improving fairness when one lacks access to a dataset labeled with protected groups, making it difficult to take advantage of strategies that can improve fairness but require protected group labels, either at training or runtime. To address this, we investigate improving fairness metrics for proxy groups, and test whether doing so results in improved fairness for the true sensitive groups. Results on benchmark and real-world datasets demonstrate that such a proxy fairness strategy can work well in practice. However, we caution that the effectiveness likely depends on the choice of fairness metric, as well as how aligned the proxy groups are with the true protected groups in terms of the constrained model parameters.
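
As a rough illustration of the comparison described above, the sketch below measures one fairness metric on proxy groups and on the true protected groups for the same predictions. The abstract does not name a specific metric, so demographic parity (the gap in positive-prediction rates) is an assumption here, and the helper names and toy data are hypothetical.

```python
def positive_rate(predictions, groups, group):
    """Fraction of positive predictions among the members of `group`."""
    members = [p for p, g in zip(predictions, groups) if g == group]
    return sum(members) / len(members)


def parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


# Hypothetical toy data: binary predictions for eight individuals.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
# True protected groups (assumed unavailable in practice).
true_groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
# Proxy groups that disagree with the true groups for two individuals.
proxy_groups = ["a", "a", "a", "b", "a", "b", "b", "b"]

gap_true = parity_gap(predictions, true_groups)
gap_proxy = parity_gap(predictions, proxy_groups)
print(f"gap on true groups:  {gap_true:.2f}")
print(f"gap on proxy groups: {gap_proxy:.2f}")
```

In this toy case the gap measured on the proxy groups differs from the gap on the true groups, which is the kind of misalignment the abstract cautions about.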


Recursive Neural Networks in Quark/Gluon Tagging

arXiv.org Machine Learning

With machine learning techniques improving rapidly, it has been shown that image recognition techniques based on deep neural networks can be used to detect jet substructure, and that deep neural networks can match or outperform the traditional approach of expert features. However, there are disadvantages, such as the sparseness of jet images. Based on the natural tree-like structure of jet sequential clustering, recursive neural networks (RecNNs), which embed the jet clustering history recursively as in natural language processing, behave better when confronted with these problems. We therefore explore the performance of RecNNs in quark/gluon discrimination. The results show that RecNNs work better than the baseline boosted decision tree (BDT) by a few percent in gluon rejection rate. However, additionally implementing particle flow identification increases the performance only slightly. We also experimented with some relevant aspects that might influence the performance of the networks. The experiments show that even taking only particle flow identification as the input feature, without any extra information on momentum or angular position, already gives a fairly good result, which indicates that most of the information for quark/gluon discrimination is already contained in the tree structure itself. As a bonus, a rough discrimination between up and down quark jets is also explored.


Beyond One-hot Encoding: lower dimensional target embedding

arXiv.org Artificial Intelligence

Target encoding plays a central role when learning Convolutional Neural Networks. In this realm, one-hot encoding is the most prevalent strategy due to its simplicity. However, this widespread encoding scheme assumes a flat label space, thus ignoring rich relationships among labels that can be exploited during training. In large-scale datasets, data does not span the full label space, but instead lies in a low-dimensional output manifold. Following this observation, we embed the targets into a low-dimensional space, drastically improving convergence speed while preserving accuracy. Our contribution is twofold: (i) we show that random projections of the label space are a valid tool to find such lower dimensional embeddings, dramatically boosting convergence rates at zero computational cost; and (ii) we propose a normalized eigenrepresentation of the class manifold that encodes the targets with minimal information loss, improving the accuracy of the random-projection encoding while enjoying the same convergence rates. Experiments on CIFAR-100, CUB200-2011, ImageNet, and MIT Places demonstrate that the proposed approach drastically improves convergence speed while reaching very competitive accuracy rates.
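
The random-projection idea in contribution (i) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian projection, the unit-norm rows, the nearest-row decoding by dot product, and the function names are all choices made here for clarity. Multiplying a one-hot vector by the projection matrix simply selects one row, so each class is represented by a single row of that matrix.

```python
import math
import random


def random_label_embedding(num_classes, dim, seed=0):
    """Return one random unit-norm vector of length `dim` per class."""
    rng = random.Random(seed)
    rows = []
    for _ in range(num_classes):
        row = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row])
    return rows


def encode(label, embedding):
    """Low-dimensional target for `label` (one-hot times the projection)."""
    return embedding[label]


def decode(vector, embedding):
    """Recover the class whose embedding has the largest dot product."""
    scores = [sum(a * b for a, b in zip(vector, row)) for row in embedding]
    return scores.index(max(scores))


# 100 classes embedded in 16 dimensions instead of a 100-dimensional one-hot.
embedding = random_label_embedding(num_classes=100, dim=16)
target = encode(42, embedding)
print(len(target), decode(target, embedding))
```

A network would then regress onto the 16-dimensional target rather than a 100-way one-hot vector; the training loss is left out here because the abstract does not specify it.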


Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels

arXiv.org Artificial Intelligence

One important assumption underlying common classification models is the stationarity of the data. However, in real-world streaming applications, the data concept indicated by the joint distribution of feature and label is not stationary but drifting over time. Concept drift detection aims to detect such drifts and adapt the model so as to mitigate any deterioration in the model's predictive performance. Unfortunately, most existing concept drift detection methods rely on a strong and over-optimistic condition that the true labels are available immediately for all already classified instances. In this paper, a novel Hierarchical Hypothesis Testing framework with Request-and-Reverify strategy is developed to detect concept drifts by requesting labels only when necessary. Two methods, namely Hierarchical Hypothesis Testing with Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed respectively under the novel framework. In experiments with benchmark datasets, our methods demonstrate overwhelming advantages over state-of-the-art unsupervised drift detectors. More importantly, our methods even outperform DDM (the widely used supervised drift detector) when we use significantly fewer labels.
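
For context on the supervised baseline mentioned above, here is a minimal sketch of the commonly described DDM rule, which needs the true label of every classified instance. It is not the HHT-CU or HHT-AG method. The 30-sample warm-up and the strict inequalities are assumptions made here (the strict form avoids an immediate trigger after an error-free warm-up), and the class and function names are hypothetical.

```python
import math


class DDM:
    """Drift detection from a stream of 0/1 classification errors."""

    def __init__(self, min_samples=30):
        self.min_samples = min_samples  # assumed warm-up length
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """Consume one error flag; return 'stable', 'warning', or 'drift'."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n              # running error rate
        s = math.sqrt(p * (1 - p) / self.n)   # its standard deviation
        if self.n < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:   # remember the best point so far
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + 3 * self.s_min:
            return "drift"
        if p + s > self.p_min + 2 * self.s_min:
            return "warning"
        return "stable"


def first_drift(error_stream):
    """1-based position of the first 'drift' signal, or None if there is none."""
    detector = DDM()
    for position, error in enumerate(error_stream, start=1):
        if detector.update(error) == "drift":
            return position
    return None


# Synthetic stream: 10% errors for 200 instances, then every prediction is wrong.
stream = [1 if i % 10 == 0 else 0 for i in range(1, 201)] + [1] * 50
print(first_drift(stream))
```

Every call to `update` consumes a true label, which is the labeling cost that a request-only-when-necessary strategy aims to reduce.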


This Japanese AI security camera shows the future of surveillance will be automated

#artificialintelligence

The world of automated surveillance is booming, with new machine learning techniques giving CCTV cameras the ability to spot troubling behavior without human supervision. And sooner or later, this tech will be coming to a store near you -- as illustrated by a new AI security cam built by Japanese telecom giant NTT East and startup Earth Eyes Corp. The security camera is called the "AI Guardman" and is designed to help shop owners in Japan spot potential shoplifters. It uses open source technology developed by Carnegie Mellon University to scan live video streams and estimate the poses of any bodies it can see. The system then tries to match this pose data to predefined 'suspicious' behavior. If it sees something noteworthy, it alerts shopkeepers via a connected app.


A comparative study of artificial intelligence and human doctors for the purpose of triage and diagnosis

arXiv.org Artificial Intelligence

Online symptom checkers have significant potential to improve patient care; however, their reliability and accuracy remain variable. We hypothesised that an artificial intelligence (AI) powered triage and diagnostic system would compare favourably with human doctors with respect to triage and diagnostic accuracy. We performed a prospective validation study of the accuracy and safety of an AI powered triage and diagnostic system. Identical cases were evaluated by both an AI system and human doctors. Differential diagnoses and triage outcomes were evaluated by an independent judge, who was blinded to the source (AI system or human doctor) of the outcomes. Independently of these cases, vignettes from publicly available resources were also assessed to provide a benchmark to previous studies and the diagnostic component of the MRCGP exam. Overall, we found that the Babylon AI powered Triage and Diagnostic System was able to identify the condition modelled by a clinical vignette with accuracy comparable to human doctors (in terms of precision and recall). In addition, we found that the triage advice recommended by the AI System was, on average, safer than that of human doctors, when compared to the ranges of acceptable triage provided by independent expert judges, with only a minimal reduction in appropriateness.


A Machine-learning framework for automatic reference-free quality assessment in MRI

arXiv.org Machine Learning

Magnetic resonance (MR) imaging offers a wide variety of imaging techniques. A large amount of data is created per examination, which needs to be checked for sufficient quality in order to derive a meaningful diagnosis. This is a manual process and therefore time- and cost-intensive. Any imaging artifacts originating from scanner hardware, signal processing or induced by the patient may reduce the image quality and complicate the diagnosis or any image post-processing. Therefore, the assessment or the assurance of sufficient image quality in an automated manner is of high interest. Usually, no reference image is available, or one is difficult to define. Therefore, classical reference-based approaches are not applicable. Model observers mimicking the human observers (HO) can assist in this task. Thus, we propose a new machine-learning-based reference-free MR image quality assessment framework which is trained on HO-derived labels to assess MR image quality immediately after each acquisition. We include the concept of active learning and present an efficient blinded reading platform to reduce the effort in the HO labeling procedure. Derived image features and the applied classifiers (support vector machine, deep neural network) are investigated for a cohort of 250 patients. The MR image quality assessment framework can achieve a high test accuracy of 93.7% for estimating quality classes on a 5-point Likert scale. The proposed MR image quality assessment framework is able to provide an accurate and efficient quality estimation which can be used as a prospective quality assurance including automatic acquisition adaptation or guided MR scanner operation, and/or as a retrospective quality assessment including support of diagnostic decisions or quality control in cohort studies.


Mimic and Classify: A meta-algorithm for Conditional Independence Testing

arXiv.org Machine Learning

Given independent samples generated from the joint distribution $p(\mathbf{x},\mathbf{y},\mathbf{z})$, we study the problem of Conditional Independence Testing (CI Testing), i.e., whether the joint equals the CI distribution $p^{CI}(\mathbf{x},\mathbf{y},\mathbf{z})= p(\mathbf{z}) p(\mathbf{y}|\mathbf{z})p(\mathbf{x}|\mathbf{z})$ or not. We cast this problem under the purview of the proposed, provable meta-algorithm, "Mimic and Classify", which is realized in two steps: (a) Mimic the CI distribution closely enough to recover the support, and (b) Classify to distinguish the joint and the CI distribution. Thus, as long as we have a good generative model and a good classifier, we potentially have a sound CI Tester. With this modular paradigm, CI Testing becomes amenable to state-of-the-art generative and classification methods from modern advances in Deep Learning, which in general can handle issues related to the curse of dimensionality and operation in the small-sample regime. We present extensive numerical experiments on synthetic and real datasets where new mimic methods, such as conditional GANs and regression with neural nets, outperform the current best CI Testing performance in the literature. Our theoretical results provide analysis on the estimation of the null distribution and allow for general measures, i.e., when some of the random variables are discrete and some are continuous, or when one or more of them are discrete-continuous mixtures.


Track Xplorer: A System for Visual Analysis of Sensor-based Motor Activity Predictions

arXiv.org Artificial Intelligence

With the rapid commoditization of wearable sensors, detecting human movements from sensor datasets has become increasingly common over a wide range of applications. To detect activities, data scientists iteratively experiment with different classifiers before deciding which model to deploy. Effective reasoning about and comparison of alternative classifiers are crucial in successful model development. This is, however, inherently difficult in developing classifiers for sensor data, where the intricacy of long temporal sequences, high prediction frequency, and imprecise labeling make standard evaluation methods relatively ineffective and even misleading. We introduce Track Xplorer, an interactive visualization system to query, analyze, and compare the predictions of sensor-data classifiers. Track Xplorer enables users to interactively explore and compare the results of different classifiers, and assess their accuracy with respect to the ground-truth labels and video. Through integration with a version control system, Track Xplorer supports tracking of models and their parameters without additional workload on model developers. Track Xplorer also contributes an extensible algebra over track representations to filter, compose, and compare classification outputs, enabling users to reason effectively about classifier performance. We apply Track Xplorer in a collaborative project to develop classifiers to detect movements from multisensor data gathered from Parkinson's disease patients. We demonstrate how Track Xplorer helps identify early on possible systemic data errors, effectively track and compare the results of different classifiers, and reason about and pinpoint the causes of misclassifications.


Equalizing Financial Impact in Supervised Learning

arXiv.org Machine Learning

Machine learning is revolutionizing the way we interact with the world. Popular websites use algorithms to analyze user data and recommend videos, customize social media feeds, and optimize advertisements. Unsurprisingly, machine learning is taking a large role in making decisions about human beings, ranging from credit to parole decisions, and is likely to be more and more widely used in the future. It is not hard to imagine that, even in cases where the final decisions are made by people, they will be doing so with advice from algorithms that make inferences from patterns in petabytes of data. Some proponents of machine learning have suggested that not only are these algorithms able to leverage the increasing amount of data we have access to, but also that they might be able to make these decisions more fairly, as they seem to not be subject to human biases. There is some truth to these claims.