Collaborating Authors

Fast, Accurate and Interpretable Time Series Classification Through Randomization Machine Learning

Time series classification (TSC) aims to predict the class label of a given time series, which is critical to a rich set of application areas such as economics and medicine. State-of-the-art TSC methods have mostly focused on classification accuracy and efficiency, without considering the interpretability of their classifications, which is an important property required by modern applications such as appliance modeling and legislation such as the European General Data Protection Regulation. To address this gap, we propose a novel TSC method - the Randomized-Supervised Time Series Forest (r-STSF). r-STSF is highly efficient, achieves state-of-the-art classification accuracy and enables interpretability. r-STSF takes an efficient interval-based approach to classify time series according to aggregate values of discriminatory sub-series (intervals). To achieve state-of-the-art accuracy, r-STSF builds an ensemble of randomized trees using the discriminatory sub-series. It uses four time series representations, nine aggregation functions and a supervised binary-inspired search combined with a feature ranking metric to identify highly discriminatory sub-series. The discriminatory sub-series enable interpretable classifications. Experiments on extensive datasets show that r-STSF achieves state-of-the-art accuracy while being orders of magnitude faster than most existing TSC methods. It is the only classifier from the state-of-the-art group that enables interpretability. Our findings also highlight that r-STSF is the best TSC method when classifying complex time series datasets.

Integrating global spatial features in CNN based Hyperspectral/SAR imagery classification Machine Learning

The land cover classification has played an important role in remote sensing because it can intelligently identify things in one huge remote sensing image to reduce the work of humans. However, a lot of classification methods are designed based on the pixel feature or limited spatial feature of the remote sensing image, which limits the classification accuracy and universality of their methods. This paper proposed a novel method to take into the information of remote sensing image, i.e., geographic latitude-longitude information. In addition, a dual-branch convolutional neural network (CNN) classification method is designed in combination with the global information to mine the pixel features of the image. Then, the features of the two neural networks are fused with another fully neural network to realize the classification of remote sensing images. Finally, two remote sensing images are used to verify the effectiveness of our method, including hyperspectral imaging (HSI) and polarimetric synthetic aperture radar (PolSAR) imagery. The result of the proposed method is superior to the traditional single-channel convolutional neural network.

Exploring Strategies for Classification of External Stimuli Using Statistical Features of the Plant Electrical Response Machine Learning

Plants sense their environment by producing electrical signals which in essence represent changes in underlying physiological processes. These electrical signals, when monitored, show both stochastic and deterministic dynamics. In this paper, we compute 11 statistical features from the raw non-stationary plant electrical signal time series to classify the stimulus applied (causing the electrical signal). By using different discriminant analysis based classification techniques, we successfully establish that there is enough information in the raw electrical signal to classify the stimuli. In the process, we also propose two standard features which consistently give good classification results for three types of stimuli - Sodium Chloride (NaCl), Sulphuric Acid (H2SO4) and Ozone (O3). This may facilitate reduction in the complexity involved in computing all the features for online classification of similar external stimuli in future.

Sequential Feature Classification in the Context of Redundancies Machine Learning

The problem of all-relevant feature selection is concerned with finding a relevant feature set with preserved redundancies. There exist several approximations to solve this problem but only one could give a distinction between strong and weak relevance. This approach was limited to the case of linear problems. In this work, we present a new solution for this distinction in the non-linear case through the use of random forest models and statistical methods.

Unsupervised feature learning for audio classification using convolutional deep belief networks

Neural Information Processing Systems

In recent years, deep learning approaches have gained significant interest as a way of building hierarchical representations from unlabeled data. However, to our knowledge, these deep learning approaches have not been extensively studied for auditory data. In this paper, we apply convolutional deep belief networks to audio data and empirically evaluate them on various audio classification tasks. For the case of speech data, we show that the learned features correspond to phones/phonemes. In addition, our feature representations trained from unlabeled audio data show very good performance for multiple audio classification tasks. We hope that this paper will inspire more research on deep learning approaches applied to a wide range of audio recognition tasks.