sensor 1
SincPD: An Explainable Method based on Sinc Filters to Diagnose Parkinson's Disease Severity by Gait Cycle Analysis
Salimi-Badr, Armin, Veisi, Mahan, Berangi, Sadra
This paper presents SincPD, an explainable deep learning classifier based on adaptive sinc filters that diagnoses Parkinson's Disease (PD) and determines its severity by analyzing the gait cycle. Considering the effects of PD on the gait cycle of patients, the proposed method uses raw vertical Ground Reaction Force (vGRF) data measured by wearable sensors placed in the soles of subjects' shoes. The method consists of Sinc layers that model adaptive bandpass filters to extract the important frequency bands in the gait cycles of patients and healthy subjects. By considering these frequencies, the reasons behind classifying a person as a patient or as healthy can be explained. In this method, after some preprocessing steps, a large model equipped with many filters is first trained. Next, to prune the extra units and reach a more explainable and parsimonious structure, the extracted filters are clustered based on their cut-off frequencies using a centroid-based clustering approach. Afterward, the medoids of the extracted clusters are taken as the final filters. As a result, only 15 bandpass filters per sensor are derived to classify patients and healthy subjects. Finally, the most effective filters and sensors are determined by comparing the energy of each filter's response for patients and for healthy subjects.
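To make the idea of a sinc-based bandpass filter and the per-filter energy comparison concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the Hamming window, the 100 Hz sampling rate, the tap count, and the synthetic test signal are all assumptions chosen for illustration. The kernel is built as the difference of two ideal low-pass sinc filters, which is one standard way to parameterize a bandpass filter by its two cut-off frequencies.

```python
import numpy as np

def sinc_bandpass(f_low, f_high, fs, num_taps=101):
    """Windowed-sinc bandpass kernel defined only by its two cut-offs (Hz)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2      # symmetric sample index
    f1, f2 = f_low / fs, f_high / fs                  # cut-offs in cycles/sample
    # Difference of two low-pass filters; np.sinc(x) = sin(pi*x) / (pi*x).
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    return kernel * np.hamming(num_taps)              # window reduces ripple

def band_energy(signal, kernel):
    """Mean squared value of the signal after filtering with the kernel."""
    filtered = np.convolve(signal, kernel, mode="same")
    return float(np.mean(filtered ** 2))

# Hypothetical example: a 5 Hz tone sampled at 100 Hz (not real vGRF data).
fs = 100.0
t = np.arange(0, 10, 1 / fs)
signal = np.sin(2 * np.pi * 5.0 * t)

kernel_in = sinc_bandpass(3.0, 8.0, fs)     # band containing the tone
kernel_out = sinc_bandpass(20.0, 30.0, fs)  # band far from the tone

e_in = band_energy(signal, kernel_in)
e_out = band_energy(signal, kernel_out)
print(e_in > e_out)
```

Comparing such energies between two groups of recordings, filter by filter, is the kind of analysis the abstract describes for ranking filters and sensors; in the trained model the cut-offs would be learned rather than fixed by hand as they are here.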
On Learning what to Learn: heterogeneous observations of dynamics and establishing (possibly causal) relations among them
Sroczynski, David W., Dietrich, Felix, Koronaki, Eleni D., Talmon, Ronen, Coifman, Ronald R., Bollt, Erik, Kevrekidis, Ioannis G.
Before we attempt to learn a function between two (sets of) observables of a physical process, we must first decide what the inputs and what the outputs of the desired function are going to be. Here we demonstrate two distinct, data-driven ways of initially deciding "the right quantities" to relate through such a function, and then proceed to learn it. This is accomplished by processing multiple simultaneous heterogeneous data streams (ensembles of time series) from observations of a physical system: multiple observation processes of the system. We thus determine (a) what subsets of observables are common between the observation processes (and therefore observable from each other, relatable through a function); and (b) what information is unrelated to these common observables, and therefore particular to each observation process, and not contributing to the desired function. Any data-driven function approximation technique can subsequently be used to learn the input-output relation, from k-nearest neighbors and Geometric Harmonics to Gaussian Processes and Neural Networks. Two particular "twists" of the approach are discussed. The first has to do with the identifiability of particular quantities of interest from the measurements. We now construct mappings from a single set of observations of one process to entire level sets of measurements of the process, consistent with this single set. The second attempts to relate our framework to a form of causality: if one of the observation processes measures "now", while the second observation process measures "in the future", the function to be learned among what is common across observation processes constitutes a dynamical model for the system evolution.
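As an illustration of the final step only (learning the input-output relation once the common observables have been identified), here is a minimal k-nearest-neighbors regression sketch in NumPy. It does not implement the paper's procedure for finding the common observables; the function name `knn_predict`, the choice of k, and the toy relation y = x^2 are assumptions made for the example.

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, k=3):
    """Predict y at each query point as the mean y of its k nearest neighbors."""
    x_train = np.asarray(x_train, dtype=float).reshape(len(x_train), -1)
    x_query = np.asarray(x_query, dtype=float).reshape(len(x_query), -1)
    y_train = np.asarray(y_train, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]           # indices of k neighbors
    return y_train[nearest].mean(axis=1)

# Toy data standing in for a "common" input observable and its output.
x = np.linspace(0.0, 1.0, 101)
y = x ** 2
pred = knn_predict(x, y, [0.5], k=3)
print(pred)
```

Any of the other approximators named in the abstract could be substituted for this function without changing the surrounding workflow.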
Industry 4.0 Projects on NASA Turbofan Engines -- Part 1
Despite being released more than a decade ago, NASA's turbofan engine degradation simulation dataset (CMAPSS) remains popular and relevant today. In this series, I plan to demonstrate and explain multiple analysis techniques while providing a solution for more complex datasets. The Turbofan dataset consists of four datasets of increasing complexity. Engines start normally but develop a malfunction over time. In the training sets, engines are run to failure, while in the test sets the time series ends 'a period' before they fail.
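Because the training engines are run to failure, the last recorded cycle of each engine can be treated as its failure point. One common way to use this, assumed here rather than stated in the text, is to label every training row with the number of cycles remaining until that last cycle. The sketch below uses plain Python; the `(unit_id, cycle)` row layout and the function name `remaining_cycles` are hypothetical.

```python
from collections import defaultdict

def remaining_cycles(rows):
    """For each (unit_id, cycle) row, return cycles left until that unit's last cycle."""
    last_cycle = defaultdict(int)
    for unit, cycle in rows:
        last_cycle[unit] = max(last_cycle[unit], cycle)   # final cycle per engine
    return [last_cycle[unit] - cycle for unit, cycle in rows]

# Two toy engines: unit 1 runs 3 cycles, unit 2 runs 2 cycles.
rows = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
rul = remaining_cycles(rows)
print(rul)  # [2, 1, 0, 1, 0]
```

This labeling only applies to the training sets; in the test sets the series stops before failure, so the last observed cycle is not the failure point.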