Accuracy
Convolutional Hashing for Automated Scene Matching
Loncaric, Martin, Liu, Bowei, Weber, Ryan
Many information retrieval tasks rely on high dimensional searches, including K-nearest neighbors (KNN), approximate nearest neighbors (ANN), and exact r-neighbor lookup in Hamming space. At scale, these searches are enabled by indexes on binary hashes, such as locality-sensitive hashing (LSH) and multi-indexing [1]. Recent research has flourished on these topics due to enormous growth in data volume and industry applications [2]. We present a powerful new approach to a fundamental challenge in these tasks: learning a good binary hash function. We demonstrate the effectiveness of our method by applying it to the task of automated scene matching (ASM) with a multi-index system. We call our model convolutional hashing for automated scene matching (CHASM). To the best of our knowledge, it is the first neural network to outperform state-of-the-art hash functions like Haar wavelets and color layout descriptors at ASM across the board.
Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection
Paudice, Andrea, Muñoz-González, Luis, Gyorgy, Andras, Lupu, Emil C.
Machine learning has become an important component for many systems and applications including computer vision, spam filtering, malware and network intrusion detection, among others. Despite the capabilities of machine learning algorithms to extract valuable information from data and produce accurate predictions, it has been shown that these algorithms are vulnerable to attacks. Data poisoning is one of the most relevant security threats against machine learning systems, where attackers can subvert the learning process by injecting malicious samples in the training data. Recent work in adversarial machine learning has shown that the so-called optimal attack strategies can successfully poison linear classifiers, degrading the performance of the system dramatically after compromising a small fraction of the training dataset. In this paper we propose a defence mechanism to mitigate the effect of these optimal poisoning attacks based on outlier detection. We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constrains are considered to craft the attack. Hence, they can be detected with an appropriate pre-filtering of the training dataset.
CISOs Look to Machine Learning to Augment Security Staffing Shortages
There is a lot of new technology, a huge demand for our skills, and a bright future that promises only more work for us. Yet, this excitement is a two-edged blade. We often hear from peers about how hard it is to hire good security folks. My email box gets at least one mention a week from a peer looking for folks to bolster their team. But beyond anecdotes, we have data, as well.
Comparison of computer systems and ranking criteria for automatic melanoma detection in dermoscopic images
Møllersen, Kajsa, Zortea, Maciel, Schopf, Thomas R., Kirchesch, Herbert, Godtliebsen, Fred
Melanoma is the deadliest form of skin cancer. Computer systems can assist in melanoma detection, but are not widespread in clinical practice. In 2016, an open challenge in classification of dermoscopic images of skin lesions was announced. A training set of 900 images with corresponding class labels and semi-automatic/manual segmentation masks was released for the challenge. An independent test set of 379 images was used to rank the participants. This article demonstrates the impact of ranking criteria, segmentation method and classifier, and highlights the clinical perspective. We compare five different measures for diagnostic accuracy by analysing the resulting ranking of the computer systems in the challenge. Choice of performance measure had great impact on the ranking. Systems that were ranked among the top three for one measure, dropped to the bottom half when changing performance measure. Nevus Doctor, a computer system previously developed by the authors, was used to investigate the impact of segmentation and classifier. The unexpected small impact of automatic versus semi-automatic/manual segmentation suggests that improvements of the automatic segmentation method w.r.t. resemblance to semi-automatic/manual segmentation will not improve diagnostic accuracy substantially. A small set of similar classification algorithms are used to investigate the impact of classifier on the diagnostic accuracy. The variability in diagnostic accuracy for different classifier algorithms was larger than the variability for segmentation methods, and suggests a focus for future investigations. From a clinical perspective, the misclassification of a melanoma as benign has far greater cost than the misclassification of a benign lesion. For computer systems to have clinical impact, their performance should be ranked by a high-sensitivity measure.
A Theoretical Investigation of Graph Degree as an Unsupervised Normality Measure
Aytekin, Caglar, Cricri, Francesco, Fan, Lixin, Aksu, Emre
For a graph representation of a dataset, a straightforward normality measure for a sample can be its graph degree. Considering a weighted graph, degree of a sample is the sum of the corresponding row's values in a similarity matrix. The measure is intuitive given the abnormal samples are usually rare and they are dissimilar to the rest of the data. In order to have an in-depth theoretical understanding, in this manuscript, we investigate the graph degree in spectral graph clustering based and kernel based point of views and draw connections to a recent kernel method for the two sample problem. We show that our analyses guide us to choose fully-connected graphs whose edge weights are calculated via universal kernels. We show that a simple graph degree based unsupervised anomaly detection method with the above properties, achieves higher accuracy compared to other unsupervised anomaly detection methods on average over 10 widely used datasets. We also provide an extensive analysis on the effect of the kernel parameter on the method's accuracy.
An Occluded Stacked Hourglass Approach to Facial Landmark Localization and Occlusion Estimation
Yuen, Kevan, Trivedi, Mohan M.
A key step to driver safety is to observe the driver's activities with the face being a key step in this process to extracting information such as head pose, blink rate, yawns, talking to passenger which can then help derive higher level information such as distraction, drowsiness, intent, and where they are looking. In the context of driving safety, it is important for the system perform robust estimation under harsh lighting and occlusion but also be able to detect when the occlusion occurs so that information predicted from occluded parts of the face can be taken into account properly. This paper introduces the Occluded Stacked Hourglass, based on the work of original Stacked Hourglass network for body pose joint estimation, which is retrained to process a detected face window and output 68 occlusion heat maps, each corresponding to a facial landmark. Landmark location, occlusion levels and a refined face detection score, to reject false positives, are extracted from these heat maps. Using the facial landmark locations, features such as head pose and eye/mouth openness can be extracted to derive driver attention and activity. The system is evaluated for face detection, head pose, and occlusion estimation on various datasets in the wild, both quantitatively and qualitatively, and shows state-of-the-art results.
Highly accurate model for prediction of lung nodule malignancy with CT scans
Causey, Jason, Zhang, Junyu, Ma, Shiqian, Jiang, Bo, Qualls, Jake, Politte, David G., Prior, Fred, Zhang, Shuzhong, Huang, Xiuzhen
Computed tomography (CT) examinations are commonly used to predict lung nodule malignancy in patients, which are shown to improve noninvasive early diagnosis of lung cancer. It remains challenging for computational approaches to achieve performance comparable to experienced radiologists. Here we present NoduleX, a systematic approach to predict lung nodule malignancy from CT data, based on deep learning convolutional neural networks (CNN). For training and validation, we analyze >1000 lung nodules in images from the LIDC/IDRI cohort. All nodules were identified and classified by four experienced thoracic radiologists who participated in the LIDC project. NoduleX achieves high accuracy for nodule malignancy classification, with an AUC of ~0.99. This is commensurate with the analysis of the dataset by experienced radiologists. Our approach, NoduleX, provides an effective framework for highly accurate nodule malignancy prediction with the model trained on a large patient population. Our results are replicable with software available at http://bioinformatics.astate.edu/NoduleX.
The Best Metric to Measure Accuracy of Classification Models
Unlike evaluating the accuracy of models that predict a continuous or discrete dependent variable like Linear Regression models, evaluating the accuracy of a classification model could be more complex and time-consuming. Before measuring the accuracy of classification models, an analyst would first measure its robustness with the help of metrics such as AIC-BIC, AUC-ROC, AUC- PR, Kolmogorov-Smirnov chart, etc. The next logical step is to measure its accuracy. To understand the complexity behind measuring the accuracy, we need to know few basic concepts. E.g. – A classification model like Logistic Regression will output a probability number between 0 and 1 instead of the desired output of actual target variable like Yes/No, etc.
4 Reasons Your Machine Learning Model is Wrong (and How to Fix It)
There are a number of machine learning models to choose from. We can use Linear Regression to predict a value, Logistic Regression to classify distinct outcomes, and Neural Networks to model non-linear behaviors. When we build these models, we always use a set of historical data to help our machine learning algorithms learn what is the relationship between a set of input features to a predicted output. But even if this model can accurately predict a value from historical data, how do we know it will work as well on new data? Or more plainly, how do we evaluate whether a machine learning model is actually "good"?
Weighting Scheme for a Pairwise Multi-label Classifier Based on the Fuzzy Confusion Matrix
Trajdos, Pawel, Kurzynski, Marek
In this work we addressed the issue of applying a stochastic classifier and a local, fuzzy confusion matrix under the framework of multi-label classification. We proposed a novel solution to the problem of correcting label pairwise ensembles. The main step of the correction procedure is to compute classifier-specific competence and cross-competence measures, which estimates error pattern of the underlying classifier. At the fusion phase we employed two weighting approaches based on information theory. The classifier weights promote base classifiers which are the most susceptible to the correction based on the fuzzy confusion matrix. During the experimental study, the proposed approach was compared against two reference methods. The comparison was made in terms of six different quality criteria. The conducted experiments reveals that the proposed approach eliminates one of main drawbacks of the original FCM-based approach i.e. the original approach is vulnerable to the imbalanced class/label distribution. What is more, the obtained results shows that the introduced method achieves satisfying classification quality under all considered quality criteria. Additionally, the impact of fluctuations of data set characteristics is reduced.