 Accuracy


A survey of statistical learning techniques as applied to inexpensive pediatric Obstructive Sleep Apnea data

arXiv.org Machine Learning

Obstructive sleep apnea (OSA), a form of sleep-disordered breathing characterized by recurrent episodes of partial or complete airway obstruction during sleep, is a serious health problem, affecting an estimated 1-5% of elementary school-aged children [9, 2]. Even mild forms of untreated pediatric OSA may cause high blood pressure, behavioral challenges, or impeded growth. Compared to adults, the symptoms of childhood-onset OSA are more varied and change continuously with development, which makes diagnosis difficult. The complexity of the data from surveys, biomedical measurements, 3D facial photos, and time-series data calls for state-of-the-art techniques from mathematics and data science. Clinical data, including those considered in confirming or ruling out a diagnosis of pediatric OSA, consist of high-dimensional multi-mode data with mixtures of variables of disparate types (e.g., nominal and categorical data of different scales, interval data, time-to-event and longitudinal outcomes), also called mixed or noncommensurate data.


Learning Fairness-aware Relational Structures

arXiv.org Artificial Intelligence

The development of fair machine learning models that effectively avert bias and discrimination is an important problem that has garnered attention in recent years. The need to encode complex relational dependencies among features and variables for competent predictions requires the development of fair, yet expressive relational models. In this work, we introduce Fair-A3SL, a fairness-aware structure learning algorithm for learning relational structures, which incorporates fairness measures while learning relational graphical model structures. Our approach is versatile in being able to encode a wide range of fairness metrics such as statistical parity difference, overestimation, equalized odds, and equal opportunity, including recently proposed relational fairness measures. While existing approaches apply the fairness measures to pre-determined model structures after prediction, Fair-A3SL directly learns the structure while optimizing for the fairness measures and hence is able to remove structural bias in the model. We demonstrate the effectiveness of our learned model structures when compared with the state-of-the-art fairness models quantitatively and qualitatively on datasets representing three different modeling scenarios: i) a relational dataset, ii) a recidivism prediction dataset widely used in studying discrimination, and iii) a recommender systems dataset. Our results show that Fair-A3SL can learn fair, yet interpretable and expressive structures capable of making accurate predictions.
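Two of the fairness metrics named in the abstract can be illustrated with a minimal sketch. This is not Fair-A3SL itself; the function names, the toy data, and the sign convention (group 1 minus group 0) are assumptions made for illustration only.

```python
def positive_rate(y_pred, group, g):
    """Share of members of group g that receive the positive prediction."""
    picked = [p for p, a in zip(y_pred, group) if a == g]
    return sum(picked) / len(picked)


def true_positive_rate(y_true, y_pred, group, g):
    """Share of truly positive members of group g that are predicted positive."""
    hits = [p for t, p, a in zip(y_true, y_pred, group) if a == g and t == 1]
    return sum(hits) / len(hits)


def statistical_parity_difference(y_pred, group):
    # Assumed convention: rate for group 1 minus rate for group 0.
    return positive_rate(y_pred, group, 1) - positive_rate(y_pred, group, 0)


def equal_opportunity_difference(y_true, y_pred, group):
    # Assumed convention: TPR for group 1 minus TPR for group 0.
    return (true_positive_rate(y_true, y_pred, group, 1)
            - true_positive_rate(y_true, y_pred, group, 0))


# Hypothetical toy data: 0/1 group membership, true labels, predictions.
group = [0, 0, 0, 0, 1, 1, 1, 1]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

spd = statistical_parity_difference(y_pred, group)
eod = equal_opportunity_difference(y_true, y_pred, group)
print(spd, eod)
```

A value of zero for either quantity means the two groups are treated identically under that metric; the further from zero, the larger the disparity.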


Aspect Term Extraction using Graph-based Semi-Supervised Learning

arXiv.org Machine Learning

Aspect-based sentiment analysis is a major subarea of sentiment analysis. Many supervised and unsupervised approaches have been proposed for detecting and analyzing the sentiment of aspect terms. In this paper, a graph-based semi-supervised learning approach for aspect term extraction is proposed. In this approach, every identified token in the review document is classified as an aspect or non-aspect term from a small set of labeled tokens using the label spreading algorithm. k-Nearest Neighbor (kNN) graph sparsification is employed to make the approach more time- and memory-efficient. The proposed work is further extended to determine the polarity of the opinion words associated with the identified aspect terms in a review sentence, in order to generate a visual aspect-based summary of review documents. The experimental study is conducted on benchmark and crawled datasets from the restaurant and laptop domains with varying numbers of labeled instances. The results show that the proposed approach can achieve good Precision, Recall and Accuracy with limited availability of labeled data.
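The combination of kNN sparsification and label spreading can be sketched in plain Python. This is a generic illustration of the standard label spreading iteration, not the paper's implementation; the 2-D "token features", the class coding (0 = non-aspect, 1 = aspect, -1 = unlabeled), and all parameter values are hypothetical.

```python
import math


def knn_graph(points, k):
    """Symmetric 0/1 adjacency matrix that keeps each point's k nearest neighbours."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted((math.dist(points[i], points[j]), j)
                         for j in range(n) if j != i)
        for _, j in nearest[:k]:
            W[i][j] = W[j][i] = 1.0
    return W


def label_spreading(W, labels, n_classes, alpha=0.8, iters=50):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y, with S = D^-1/2 W D^-1/2."""
    n = len(W)
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    # One-hot seed matrix; unlabeled rows (label -1) stay all zero.
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Hypothetical token embeddings: two well-separated groups, one seed label each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
labels = [0, -1, -1, 1, -1, -1]

predicted = label_spreading(knn_graph(points, k=2), labels, n_classes=2)
print(predicted)
```

Keeping only k neighbours per node makes the affinity matrix sparse, which is where the time and memory savings mentioned in the abstract would come from; the dense lists used here are only for readability.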


Nyström Subspace Learning for Large-scale SVMs

arXiv.org Machine Learning

As an implementation of the Nyström method, Nyström computational regularization (NCR) imposed on kernel classification and kernel ridge regression has proven capable of achieving optimal bounds in the large-scale statistical learning setting, while enjoying much better time complexity. In this study, we propose a Nyström subspace learning (NSL) framework to reveal that all you need for employing the Nyström method, including NCR, upon any kernel SVM is to use the efficient off-the-shelf linear SVM solvers as a black box. Based on our analysis, the bounds developed for the Nyström method are linked to NSL, and the analytical difference between two distinct implementations of the Nyström method is clearly presented. Besides, NSL also leads to sharper theoretical results for the clustered Nyström method. Finally, both regression and classification tasks are performed to compare two implementations of the Nyström method.


Towards Certifiable Adversarial Sample Detection

arXiv.org Machine Learning

Convolutional Neural Networks (CNNs) are deployed in more and more classification systems, but adversarial samples can be maliciously crafted to trick them, and are becoming a real threat. There have been various proposals to improve CNNs' adversarial robustness but these all suffer performance penalties or other limitations. In this paper, we provide a new approach in the form of a certifiable adversarial detection scheme, the Certifiable Taboo Trap (CTT). The system can provide certifiable guarantees of detection of adversarial inputs for certain $l_{\infty}$ sizes on a reasonable assumption, namely that the training data have the same distribution as the test data. We develop and evaluate several versions of CTT with a range of defense capabilities, training overheads and certifiability on adversarial samples. Against adversaries with various $l_p$ norms, CTT outperforms existing defense methods that focus purely on improving network robustness. We show that CTT has small false positive rates on clean test data, minimal compute overheads when deployed, and can support complex security policies.


Interpretability of machine learning based prediction models in healthcare

arXiv.org Machine Learning

There is a need to ensure that machine learning models are interpretable. Higher interpretability of a model means easier comprehension and explanation of future predictions for end-users. Further, interpretable machine learning models allow healthcare experts to make reasonable, data-driven, and personalized decisions that can ultimately lead to higher quality of service in healthcare. Generally, we can classify interpretability approaches into two groups, where the first focuses on personalized interpretation (local interpretability) while the second summarizes prediction models on a population level (global interpretability). Alternatively, we can group interpretability methods into model-specific techniques, which are designed to interpret predictions generated by a specific model, such as a neural network, and model-agnostic approaches, which provide easy-to-understand explanations of predictions made by any machine learning model. Here, we give an overview of interpretability approaches and provide examples of practical interpretability of machine learning in different areas of healthcare, including prediction of health-related outcomes, optimizing treatments, and improving the efficiency of screening for specific conditions. Further, we outline future directions for interpretable machine learning and highlight the importance of developing algorithmic solutions that can enable machine-learning-driven decision making in high-stakes healthcare problems.


Classification and Disease Localization in Histopathology Using Only Global Labels: A Weakly-Supervised Approach

arXiv.org Machine Learning

Analysis of histopathology slides is a critical step for many diagnoses, particularly in oncology, where it defines the gold standard. In the case of digital histopathological analysis, highly trained pathologists must review vast whole-slide-images of extreme digital resolution ($100,000^2$ pixels) across multiple zoom levels in order to locate abnormal regions of cells, or in some cases single cells, out of millions. The application of deep learning to this problem is hampered not only by small sample sizes, as typical datasets contain only a few hundred samples, but also by the need to generate ground-truth localized annotations for training interpretable classification and segmentation models. We propose a method for disease localization in the context of weakly supervised learning, where only image-level labels are available during training. Even without pixel-level annotations, we are able to demonstrate performance comparable with models trained with strong annotations on the Camelyon-16 lymph node metastases detection challenge. We accomplish this through the use of pre-trained deep convolutional networks, feature embedding, and learning via top instances and negative evidence, a multiple instance learning technique from the field of semantic segmentation and object detection.
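One possible reading of "top instances and negative evidence" is a pooling step that keeps only the highest-scoring and lowest-scoring tiles of a slide. The sketch below illustrates that idea under stated assumptions: the function names, the per-tile scores, and the averaging rule used to form a slide-level score are all hypothetical and are not taken from the paper.

```python
def top_and_bottom_instances(tile_scores, r):
    """Return the r highest tile scores (top instances) and the r lowest
    tile scores (negative evidence) for one slide."""
    if not 1 <= r <= len(tile_scores):
        raise ValueError("r must be between 1 and the number of tiles")
    ordered = sorted(tile_scores, reverse=True)
    return ordered[:r], ordered[-r:]


def slide_score(tile_scores, r):
    """Assumed aggregation: average of the selected top and bottom scores.
    A learned layer could replace this fixed rule."""
    top, bottom = top_and_bottom_instances(tile_scores, r)
    return (sum(top) + sum(bottom)) / (2 * r)


# Hypothetical per-tile scores produced by some tile-level model.
scores = [0.1, -0.4, 2.3, 0.0, 1.7, -1.2]
top, bottom = top_and_bottom_instances(scores, r=2)
print(top, bottom, slide_score(scores, r=2))
```

Because only a handful of tiles drive the slide-level output, the selected tiles also indicate where on the slide the evidence for or against disease is located, which is what makes localization possible from image-level labels alone.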


A Model-Based, Decision-Theoretic Perspective on Automated Cyber Response

arXiv.org Artificial Intelligence

Cyber-attacks can occur at machine speeds that are far too fast for human-in-the-loop (or sometimes on-the-loop) decision making to be a viable option. Although human inputs are still important, a defensive Artificial Intelligence (AI) system must have considerable autonomy in these circumstances. When the AI system is model-based, its behavioral responses can be aligned with risk-aware cost/benefit tradeoffs defined by user-supplied preferences that capture the key aspects of how human operators understand the system, the adversary, and the mission. This paper describes an approach to automated cyber response that is designed along these lines. We combine a simulation of the system to be defended with an anytime online planner to solve cyber defense problems characterized as partially observable Markov decision problems (POMDPs).
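In a POMDP the defender cannot observe the true system state and instead maintains a belief, which is updated after each action and observation. The following minimal sketch shows the standard Bayesian belief update; the two-state host model, the action and observation names, and every probability are hypothetical and are not taken from the paper.

```python
def belief_update(belief, action, observation, T, O):
    """Bayes filter: b'(s2) is proportional to O(o | s2, a) * sum_s T(s2 | s, a) * b(s)."""
    predicted = {s2: sum(T[action][s][s2] * belief[s] for s in belief)
                 for s2 in belief}
    unnormalized = {s2: O[action][s2][observation] * predicted[s2]
                    for s2 in belief}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / total for s, v in unnormalized.items()}


# Hypothetical transition model T[action][state][next_state].
T = {"monitor": {"clean": {"clean": 0.95, "compromised": 0.05},
                 "compromised": {"clean": 0.0, "compromised": 1.0}}}
# Hypothetical observation model O[action][next_state][observation].
O = {"monitor": {"clean": {"alert": 0.1, "quiet": 0.9},
                 "compromised": {"alert": 0.8, "quiet": 0.2}}}

prior = {"clean": 0.9, "compromised": 0.1}
posterior = belief_update(prior, "monitor", "alert", T, O)
print(posterior)
```

An online planner of the kind described would run such an update after every step and then search forward from the resulting belief to pick the next defensive action.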


Optimizing Black-box Metrics with Adaptive Surrogates

arXiv.org Artificial Intelligence

We address the problem of training models with black-box and hard-to-optimize metrics by expressing the metric as a monotonic function of a small number of easy-to-optimize surrogates. We pose the training problem as an optimization over a relaxed surrogate space, which we solve by estimating local gradients for the metric and performing inexact convex projections. We analyze gradient estimates based on finite differences and local linear interpolations, and show convergence of our approach under smoothness assumptions with respect to the surrogates. Experimental results on classification and ranking problems verify that the proposed approach performs on par with methods that know the mathematical formulation of the metric, and adds notable value when the form of the metric is unknown.
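The idea of estimating a local gradient of a black-box metric with respect to a few surrogate values can be illustrated with a generic central finite difference. This is a textbook estimator, not the paper's procedure; the `black_box_metric` function below is a hypothetical stand-in that the estimator only queries and never inspects.

```python
def finite_difference_gradient(metric, surrogates, eps=1e-4):
    """Central-difference estimate of d metric / d surrogate_i, using only
    evaluations of the metric (no access to its formula)."""
    grad = []
    for i in range(len(surrogates)):
        up = list(surrogates)
        down = list(surrogates)
        up[i] += eps
        down[i] -= eps
        grad.append((metric(up) - metric(down)) / (2 * eps))
    return grad


def black_box_metric(s):
    # Hypothetical metric, monotonically increasing in both surrogates
    # for non-negative inputs. Treated as opaque by the estimator.
    return s[0] + 2.0 * s[1] ** 2


grad = finite_difference_gradient(black_box_metric, [0.3, 0.5])
print(grad)
```

Each gradient estimate costs two metric evaluations per surrogate, which is why keeping the number of surrogates small matters when the metric is expensive to evaluate.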


Comparing AUCs of Machine Learning Models with DeLong's Test

#artificialintelligence

Have you ever wondered how to demonstrate that one machine learning model's test set performance differs significantly from the test set performance of an alternative model? This post will describe how to use DeLong's test to obtain a p-value for whether one model has a significantly different AUC than another model, where AUC refers to the area under the receiver operating characteristic curve. This post includes a hand-calculated example to illustrate all the steps in DeLong's test for a small data set. It also includes an example R implementation of DeLong's test to enable efficient calculation on large data sets. An example use case for DeLong's test: Model A predicts heart disease risk with AUC of 0.92, and Model B predicts heart disease risk with AUC of 0.87, and we use DeLong's test to demonstrate that Model A has a significantly different AUC from Model B with p < 0.05.
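The steps of DeLong's test for two models scored on the same test set can be sketched in a few lines of Python. This is an independent illustration, not the post's R implementation; the function names and the toy labels and scores are hypothetical, and the quadratic-time pairwise comparison is chosen for clarity rather than speed.

```python
import math


def _psi(x, y):
    """1 if the positive outranks the negative, 0.5 on ties, else 0."""
    return 1.0 if x > y else (0.5 if x == y else 0.0)


def _components(scores, labels):
    """Structural components: one value per positive (V10) and per negative (V01)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance with an (n - 1) denominator."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(labels, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for correlated AUCs."""
    v10a, v01a = _components(scores_a, labels)
    v10b, v01b = _components(scores_b, labels)
    auc_a = sum(v10a) / len(v10a)
    auc_b = sum(v10b) / len(v10b)
    # Variance of the AUC difference from the positive and negative components.
    var = ((_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / len(v10a)
           + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / len(v01a))
    if var <= 0:
        raise ValueError("variance of the AUC difference is zero")
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p


# Hypothetical test set: 4 positives, 4 negatives, scored by two models.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
scores_b = [0.6, 0.3, 0.8, 0.5, 0.7, 0.4, 0.2, 0.55]

auc_a, auc_b, z, p = delong_test(labels, scores_a, scores_b)
print(auc_a, auc_b, z, p)
```

The covariance terms are what distinguish this from comparing two independent AUCs: both models are evaluated on the same cases, so their AUC estimates are correlated, and ignoring that correlation would misstate the variance of the difference.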