Goto

Collaborating Authors

 Accuracy


Persistent homology machine learning for fingerprint classification

arXiv.org Machine Learning

The fingerprint classification problem is to sort fingerprints into pre-determined groups, such as arch, loop, and whorl. It was asserted in the literature that minutiae points, which are commonly used for fingerprint matching, are not useful for classification. We show that, to the contrary, near state-of-the-art classification accuracy rates can be achieved when applying topological data analysis (TDA) to 3-dimensional point clouds of oriented minutiae points. We also apply TDA to fingerprint ink-roll images, which yields a lower accuracy rate but still shows promise, particularly since the only preprocessing is cropping; moreover, combining the two approaches outperforms each one individually. These methods use supervised learning applied to persistent homology and allow us to explore feature selection on barcodes, an important topic at the interface between TDA and machine learning. We test our classification algorithms on the NIST fingerprint database SD-27.


STORE: Sparse Tensor Response Regression and Neuroimaging Analysis

arXiv.org Machine Learning

Motivated by applications in neuroimaging analysis, we propose a new regression model, Sparse TensOr REsponse regression (STORE), with a tensor response and a vector predictor. STORE embeds two key sparse structures: element-wise sparsity and low-rankness. It can handle both a non-symmetric and a symmetric tensor response, and thus is applicable to both structural and functional neuroimaging data. We formulate the parameter estimation as a non-convex optimization problem, and develop an efficient alternating updating algorithm. We establish a non-asymptotic estimation error bound for the actual estimator obtained from the proposed algorithm. This error bound reveals an interesting interaction between the computational efficiency and the statistical rate of convergence. When the distribution of the error tensor is Gaussian, we further obtain a fast estimation error rate which allows the tensor dimension to grow exponentially with the sample size. We illustrate the efficacy of our model through intensive simulations and an analysis of the Autism spectrum disorder neuroimaging data.


Understanding machine learning #3: Confusion matrix - not all errors are equal

@machinelearnbot

One of the most typical tasks in machine learning is classification tasks. It may seem that evaluating the effectiveness of such a model is easy. Let's assume that we have a model which, based on historical data, calculates if a client will pay back credit obligations. We evaluate 100 bank customers and our model correctly guesses in 93 instances. That may appear to be a good result – but is it really?


Learning to Predict with Highly Granular Temporal Data: Estimating individual behavioral profiles with smart meter data

arXiv.org Machine Learning

Big spatio-temporal datasets, available through both open and administrative data sources, offer significant potential for social science research. The magnitude of the data allows for increased resolution and analysis at individual level. While there are recent advances in forecasting techniques for highly granular temporal data, little attention is given to segmenting the time series and finding homogeneous patterns. In this paper, it is proposed to estimate behavioral profiles of individuals' activities over time using Gaussian Process-based models. In particular, the aim is to investigate how individuals or groups may be clustered according to the model parameters. Such a Bayesian non-parametric method is then tested by looking at the predictability of the segments using a combination of models to fit different parts of the temporal profiles. Model validity is then tested on a set of holdout data. The dataset consists of half hourly energy consumption records from smart meters from more than 100,000 households in the UK and covers the period from 2015 to 2016. The methodological approach developed in the paper may be easily applied to datasets of similar structure and granularity, for example social media data, and may lead to improved accuracy in the prediction of social dynamics and behavior.


Dynamic classifier chains for multi-label learning

arXiv.org Machine Learning

In this paper, we deal with the task of building a dynamic ensemble of chain classifiers for multi-label classification. To do so, we proposed two concepts of classifier chains algorithms that are able to change label order of the chain without rebuilding the entire model. Such modes allows anticipating the instance-specific chain order without a significant increase in computational burden. The proposed chain models are built using the Naive Bayes classifier and nearest neighbour approach as a base single-label classifiers. To take the benefits of the proposed algorithms, we developed a simple heuristic that allows the system to find relatively good label order. The heuristic sort labels according to the label-specific classification quality gained during the validation phase. The heuristic tries to minimise the phenomenon of error propagation in the chain. The experimental results showed that the proposed model based on Naive Bayes classifier the above-mentioned heuristic is an efficient tool for building dynamic chain classifiers.


Training large margin host-pathogen protein-protein interaction predictors

arXiv.org Machine Learning

Detection of protein-protein interactions (PPIs) plays a vital role in molecular biology. Particularly, infections are caused by the interactions of host and pathogen proteins. It is important to identify host-pathogen interactions (HPIs) to discover new drugs to counter infectious diseases. Conventional wet lab PPI prediction techniques have limitations in terms of large scale application and budget. Hence, computational approaches are developed to predict PPIs. This study aims to develop large margin machine learning models to predict interspecies PPIs with a special interest in host-pathogen protein interactions (HPIs). Especially, we focus on seeking answers to three queries that arise while developing an HPI predictor. 1) How should we select negative samples? 2) What should be the size of negative samples as compared to the positive samples? 3) What type of margin violation penalty should be used to train the predictor? We compare two available methods for negative sampling. Moreover, we propose a new method of assigning weights to each training example in weighted SVM depending on the distance of the negative examples from the positive examples. We have also developed a web server for our HPI predictor called HoPItor (Host Pathogen Interaction predicTOR) that can predict interactions between human and viral proteins. This webserver can be accessed at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#HoPItor.


Adapt DevOps to cognitive and artificial intelligence systems

#artificialintelligence

These new applications require a new way of thinking about the development process. Traditional application development has been enhanced by the idea of DevOps, which forces operational considerations into development time, execution, and process. In this tutorial, we outline a "cognitive DevOps" process that refines and adapts the best parts of DevOps for new cognitive applications. Specifically, we cover applying DevOps to the training process of cognitive systems including training data, modeling, and performance evaluation. A cognitive or artificial intelligence (AI) system fundamentally exhibits capabilities such as understanding, reasoning, and learning from data. At a deeper level, the system is built upon a combination of various types of cognitive tasks, which, when combined, make up a part of the overall cognitive application. The science upon which a cognitive system is built includes, but is not limited to, machine learning (ML) including deep learning and natural language processing.


BET on Independence

arXiv.org Machine Learning

We study the problem of nonparametric dependence detection. Many existing methods suffer severe power loss due to non-uniform consistency, which we illustrate with a paradox. To avoid such power loss, we approach the nonparametric test of independence through the new framework of binary expansion statistics (BEStat) and binary expansion testing (BET), which examine dependence through a novel binary expansion filtration approximation of the copula. Through a Hadamard-Walsh transform, we find that the cross interactions of binary variables in the filtration are complete sufficient statistics for dependence. These interactions are also uncorrelated under the null. By utilizing these interactions, the BET avoids the problem of non-uniform consistency and improves upon a wide class of commonly used methods (a) by achieving the minimax rate in sample size requirement for specified power and (b) by providing clear interpretations of global and local relationships upon rejection of independence. The binary expansion approach also connects the test statistics with the current computing system to facilitate efficient bitwise implementation. We illustrate the BET by a study of the distribution of stars in the night sky and by an exploratory data analysis of the TCGA breast cancer data.


Finding Differentially Covarying Needles in a Temporally Evolving Haystack: A Scan Statistics Perspective

arXiv.org Machine Learning

Recent results in coupled or temporal graphical models offer schemes for estimating the relationship structure between features when the data come from related (but distinct) longitudinal sources. A novel application of these ideas is for analyzing group-level differences, i.e., in identifying if trends of estimated objects (e.g., covariance or precision matrices) are different across disparate conditions (e.g., gender or disease). Often, poor effect sizes make detecting the differential signal over the full set of features difficult: for example, dependencies between only a subset of features may manifest differently across groups. In this work, we first give a parametric model for estimating trends in the space of SPD matrices as a function of one or more covariates. We then generalize scan statistics to graph structures, to search over distinct subsets of features (graph partitions) whose temporal dependency structure may show statistically significant group-wise differences. We theoretically analyze the Family Wise Error Rate (FWER) and bounds on Type 1 and Type 2 error. On a cohort of individuals with risk factors for Alzheimer's disease (but otherwise cognitively healthy), we find scientifically interesting group differences where the default analysis, i.e., models estimated on the full graph, do not survive reasonable significance thresholds.


Concept Drift Detection and Adaptation with Hierarchical Hypothesis Testing

arXiv.org Machine Learning

Effective techniques for analyzing and detecting changes in streaming data, especially in the era of big data, pose new challenges to the machine learning and the statistics community [1], [2]. As a result, early approaches for detecting statistical changes in a time series (such as change point detection), have had to be extended for online detection of changes in a multivariate data streams [3], [4]. Some of these techniques for detecting the intrinsic change in the relationship of the incoming data streams have been applied to numerous real-world applications, such as fraud detection, user preference prediction and email filtering, [5], [6]. Online classification is another common task performed on streaming multivariate time series data that takes advantage of these statistical relationships to predict a class label at each time index [7]. If the underlying source generating the data is not stationary, the optimal decision rule for the classifier would change over time - a phenomena known as concept drift [8]. Given the impact of concept drift on the predictive performance of an online classifier, there is a need to detect these concept drifts as early as possible. The inability of change point detection approaches to detect these concept drifts, has motivated the need for concept drift detection approaches that not only monitor the join distribution of a multivariate data stream but also changes in its relationship to the class labels of the streaming data. Shujian Yu and José C. Príncipe are with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA.