Fully Automated Myocardial Infarction Classification using Ordinary Differential Equations

arXiv.org Machine Learning

Portable, wearable, and wireless electrocardiogram (ECG) systems have the potential to serve as point-of-care diagnostic systems for cardiovascular disease. Such wearable and wireless ECG systems require automatic detection of cardiovascular disease. Even in primary care, automation of ECG diagnostic systems will improve the efficiency of ECG diagnosis and reduce the training requirements for local healthcare workers. However, few fully automatic myocardial infarction (MI) detection algorithms have been well developed. This paper presents a novel automatic MI classification algorithm using a second order ordinary differential equation (ODE) with time varying coefficients, which simultaneously captures the morphological and dynamic features of highly correlated ECG signals. By effectively estimating the unobserved state variables and the parameters of the second order ODE, the accuracy of the classification was significantly improved. The estimated time varying coefficients of the second order ODE were used as input to a support vector machine (SVM) for MI classification. The proposed method was applied to the PTB diagnostic ECG database within PhysioNet. The overall sensitivity, specificity, and classification accuracy of 12 lead ECGs for binary MI classification were 98.7%, 96.4%, and 98.3%, respectively. We also found that even using single-lead ECG signals, we can reach an accuracy as high as 97%. Multiclass MI classification is a challenging task, but the developed ODE approach for 12 lead ECGs, coupled with a multiclass SVM, reached 96.4% accuracy for classifying 5 subgroups of MI and healthy controls.
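
The abstract gives no implementation details, but the core pipeline can be sketched. Below is a minimal illustration, not the authors' code: the time-varying ODE coefficients are estimated here by simple sliding-window least squares on numerically differentiated signals (the paper estimates the unobserved state variables more carefully), and all function names, window sizes, and kernel choices are our own assumptions.

```python
# Minimal sketch (not the authors' code): fit a second-order ODE
#   x''(t) = a(t) * x'(t) + b(t) * x(t)
# to an ECG segment by sliding-window least squares, then feed the
# estimated time-varying coefficients to an SVM.
import numpy as np
from sklearn.svm import SVC

def ode_features(x, fs, win=16):
    """Estimate a(t), b(t) in x'' = a*x' + b*x over sliding windows."""
    dt = 1.0 / fs
    dx = np.gradient(x, dt)            # first derivative
    d2x = np.gradient(dx, dt)          # second derivative
    coeffs = []
    for start in range(0, len(x) - win, win):
        s = slice(start, start + win)
        A = np.column_stack([dx[s], x[s]])            # regressors
        theta, *_ = np.linalg.lstsq(A, d2x[s], rcond=None)
        coeffs.extend(theta)                          # [a_hat, b_hat] per window
    return np.asarray(coeffs)

# beats: list of equal-length ECG segments; labels: 0 = healthy, 1 = MI
# (hypothetical inputs; PTB records would be segmented beforehand)
def train_mi_classifier(beats, labels, fs=1000):
    X = np.vstack([ode_features(b, fs) for b in beats])
    return SVC(kernel="rbf").fit(X, labels)
```

Numerical differentiation amplifies noise, which is presumably why the paper estimates the unobserved state variables explicitly rather than differencing the raw signal as done here.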


Performance Guarantees for Schatten-$p$ Quasi-Norm Minimization in Recovery of Low-Rank Matrices

arXiv.org Machine Learning

We present theoretical guarantees for Schatten-$p$ quasi-norm minimization ($p \in (0,1]$) in recovering low-rank matrices from compressed linear measurements. Firstly, using null space properties of the measurement operator, we provide a sufficient condition for exact recovery of low-rank matrices. This condition guarantees unique recovery of matrices of ranks equal to or larger than what is guaranteed by nuclear norm minimization. Secondly, this sufficient condition leads to a theorem proving that all restricted isometry property (RIP)-based sufficient conditions for $\ell_p$ quasi-norm minimization generalize to Schatten-$p$ quasi-norm minimization. Based on this theorem, we provide a few RIP-based recovery conditions.
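
For reference, the problem being analyzed can be written in its standard form (a schematic; the paper's specific recovery constants are not reproduced here):

```latex
% Schatten-p quasi-norm minimization, p in (0,1]: recover a low-rank X
% from linear measurements b = A(X).
\min_{X \in \mathbb{R}^{n_1 \times n_2}} \;
  \|X\|_{S_p}^p := \sum_{i=1}^{\min(n_1, n_2)} \sigma_i(X)^p
\quad \text{s.t.} \quad \mathcal{A}(X) = b,
% where sigma_i(X) are the singular values of X; the case p = 1
% recovers nuclear norm minimization.
```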


A Novel Statistical Method Based on Dynamic Models for Classification

arXiv.org Machine Learning

Realizations of stochastic processes are often observed as temporal or functional data. There is growing interest in the classification of dynamic or functional data. The basic feature of functional data is that they are infinite-dimensional and highly correlated. An essential issue for classifying dynamic and functional data is how to effectively reduce their dimension and exploit their dynamic features. However, few statistical methods for dynamic data classification have directly used the rich dynamic features of the data. We propose to use a second order ordinary differential equation (ODE) to model the dynamic process and principal differential analysis to estimate constant or time-varying parameters in the ODE. We examine differential dynamic properties of the dynamic system across different conditions, including stability and transient response, which determine how dynamic systems maintain their function and performance under a broad range of random internal and external perturbations. We use the parameters in the ODE as features for classifiers. As a proof of principle, the proposed methods are applied to classifying normal and abnormal QRS complexes in electrocardiogram (ECG) data analysis, which is of great clinical value in the diagnosis of cardiovascular diseases. We show that in QRS complex classification the ODE-based classification methods outperform the currently widely used neural networks with Fourier expansion coefficients of the functional data as their features. We expect that dynamic model-based classification methods may open a new avenue for functional data classification.
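
A schematic of the model class described here, written in standard second-order system conventions (the paper's principal differential analysis estimator itself is not reproduced):

```latex
% Second-order ODE model with time-varying coefficients:
x''(t) + \beta_1(t)\, x'(t) + \beta_0(t)\, x(t) = 0 .
% For locally constant coefficients with beta_0 > 0, the characteristic roots
\lambda_{1,2} = \frac{-\beta_1 \pm \sqrt{\beta_1^2 - 4\beta_0}}{2}
% govern stability (Re(lambda) < 0) and the transient response, via the
% damping ratio zeta = beta_1 / (2 sqrt(beta_0)) and natural frequency
% omega_n = sqrt(beta_0) -- the kind of dynamic properties used as
% classification features.
```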


Local Rademacher Complexity for Multi-label Learning

arXiv.org Machine Learning

We analyze the local Rademacher complexity of empirical risk minimization (ERM)-based multi-label learning algorithms, and in doing so propose a new algorithm for multi-label learning. Rather than using the trace norm to regularize the multi-label predictor, we instead minimize the tail sum of the singular values of the predictor in multi-label learning. Benefiting from the use of the local Rademacher complexity, our algorithm, therefore, has a sharper generalization error bound and a faster convergence rate. Compared to methods that minimize over all singular values, concentrating on the tail singular values results in better recovery of the low-rank structure of the multi-label predictor, which plays an important role in exploiting label correlations. We propose a new conditional singular value thresholding algorithm to solve the resulting objective function. Empirical studies on real-world datasets validate our theoretical results and demonstrate the effectiveness of the proposed algorithm.
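
The "conditional singular value thresholding" step suggests a proximal update that leaves the top-$k$ singular values untouched and soft-thresholds only the tail. A minimal sketch under that reading (function name and interface are our own; this is not the paper's exact algorithm):

```python
# Minimal sketch (assumed form): proximal step for the tail-sum
# regularizer  sum_{i > k} sigma_i(W), which soft-thresholds only
# the tail singular values of W.
import numpy as np

def tail_singular_value_thresholding(W, k, tau):
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_new = s.copy()
    s_new[k:] = np.maximum(s[k:] - tau, 0.0)   # threshold the tail only
    return (U * s_new) @ Vt                    # rescale columns of U, rebuild W
```

Because the singular values are sorted in decreasing order and only the tail is shrunk, the ordering is preserved and the update stays a valid SVD reconstruction.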


Estimating the intrinsic dimension in fMRI space via dataset fractal analysis - Counting the `cpu cores' of the human brain

arXiv.org Machine Learning

Functional Magnetic Resonance Imaging (fMRI) is a powerful non-invasive tool for localizing and analyzing brain activity. This study focuses on one very important aspect of the functional properties of the human brain: the estimation of the level of parallelism when performing complex cognitive tasks. Using fMRI as the main modality, human brain activity is investigated through a purely data-driven signal processing and dimensionality analysis approach. Specifically, the fMRI signal is treated as a multi-dimensional data space and its intrinsic `complexity' is studied via dataset fractal analysis and blind-source separation (BSS) methods. One simulated and two real fMRI datasets are used in combination with Independent Component Analysis (ICA) and fractal analysis for estimating the intrinsic (true) dimensionality, in order to provide data-driven experimental evidence on the number of independent brain processes that run in parallel when visual or visuo-motor tasks are performed. Although this number cannot be defined as a strict threshold but rather as a continuous range, when a specific activation level is defined, a corresponding number of parallel processes, or the casual equivalent of `cpu cores', can be detected in normal human brain activity.
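
The abstract does not pin down the fractal estimator; a common choice for dataset fractal analysis is the correlation (Grassberger-Procaccia) dimension, sketched below under that assumption (names and parameters are illustrative):

```python
# Minimal sketch (assumed estimator, may differ from the paper's):
# estimate the intrinsic dimension of a point cloud X
# (n_samples x n_features) from the slope of log C(r) vs. log r.
import numpy as np
from scipy.spatial.distance import pdist

def correlation_dimension(X, n_radii=20):
    d = pdist(X)                                   # all pairwise distances
    radii = np.logspace(np.log10(d.min() + 1e-12),
                        np.log10(d.max()), n_radii)
    C = np.array([np.mean(d < r) for r in radii])  # correlation integral
    mask = C > 0                                   # avoid log(0) at small r
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(C[mask]), 1)
    return slope                                   # intrinsic dimension estimate
```

In the study's pipeline, such an estimate would be computed on the (ICA-reduced) fMRI data space rather than on raw voxel time series.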


Sparse Estimation using Bayesian Hierarchical Prior Modeling for Real and Complex Linear Models

arXiv.org Machine Learning

In sparse Bayesian learning (SBL), Gaussian scale mixtures (GSMs) have been used to model sparsity-inducing priors that realize a class of concave penalty functions for the regression task in real-valued signal models. Motivated by the relative scarcity of formal tools for SBL in complex-valued models, this paper proposes a GSM model - the Bessel K model - that induces concave penalty functions for the estimation of complex sparse signals. The properties of the Bessel K model are analyzed when it is applied to Type I and Type II estimation. This analysis reveals that, by tuning the parameters of the mixing pdf, different penalty functions are invoked depending on the estimation type used, the value of the noise variance, and whether real or complex signals are estimated. Using the Bessel K model, we derive a sparse estimator based on a modification of the expectation-maximization algorithm formulated for Type II estimation. The estimator includes as a special instance the algorithms proposed by Tipping and Faul [1] and by Babacan et al. [2]. Numerical results show the superiority of the proposed estimator over these state-of-the-art estimators in terms of convergence speed, sparseness, reconstruction error, and robustness in low and medium signal-to-noise ratio regimes.
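
Schematically, a Bessel K prior arises from the standard Gaussian scale mixture construction with a Gamma mixing density (the paper's exact parameterization may differ):

```latex
% GSM construction (schematic): a zero-mean Gaussian whose variance
% gamma is itself Gamma-distributed,
p(x) = \int_0^\infty \mathcal{N}(x \mid 0, \gamma)\,
       \mathrm{Ga}(\gamma \mid \epsilon, \eta)\, d\gamma ,
% yields a marginal involving the modified Bessel function of the
% second kind K_nu, hence the name. Small shape parameters epsilon
% make the implied penalty -log p(x) concave in |x|, promoting
% sparse estimates.
```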


Concavity of reweighted Kikuchi approximation

arXiv.org Machine Learning

We analyze a reweighted version of the Kikuchi approximation for estimating the log partition function of a product distribution defined over a region graph. We establish sufficient conditions for the concavity of our reweighted objective function in terms of weight assignments in the Kikuchi expansion, and show that a reweighted version of the sum-product algorithm applied to the Kikuchi region graph will produce global optima of the Kikuchi approximation whenever the algorithm converges. When the region graph has two layers, corresponding to a Bethe approximation, we show that our sufficient conditions for concavity are also necessary. Finally, we provide an explicit characterization of the polytope of concavity in terms of the cycle structure of the region graph. We conclude with simulations that demonstrate the advantages of the reweighted Kikuchi approach.
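
Schematically, the reweighted objective replaces the usual overcounting numbers of the Kikuchi expansion with free weights (a sketch of the standard form, not the paper's exact notation):

```latex
% Reweighted Kikuchi objective (schematic): weights rho_R replace the
% standard overcounting numbers c_R,
\log Z \;\approx\; \max_{\tau \in \Delta}
  \Big\{ \langle \theta, \tau \rangle
       + \sum_{R \in \mathcal{R}} \rho_R\, H_R(\tau_R) \Big\},
% where R ranges over regions of the region graph, H_R is the local
% entropy of the pseudomarginal tau_R, and Delta is the set of locally
% consistent pseudomarginals. Concavity of the objective depends on
% the weight assignment {rho_R} -- the subject of the paper's
% sufficient (and, for two layers, necessary) conditions.
```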


Parameterizing the semantics of fuzzy attribute implications by systems of isotone Galois connections

arXiv.org Artificial Intelligence

We study the semantics of fuzzy if-then rules called fuzzy attribute implications parameterized by systems of isotone Galois connections. The rules express dependencies between fuzzy attributes in object-attribute incidence data. The proposed parameterizations are general and include as special cases the parameterizations by linguistic hedges used in earlier approaches. We formalize the general parameterizations, propose bivalent and graded notions of semantic entailment of fuzzy attribute implications, show their characterization in terms of least models and complete axiomatization, and provide a characterization of bases of fuzzy attribute implications derived from data.
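
For orientation, the earlier hedge-parameterized semantics that this work generalizes can be written as follows (schematic; the proposed framework replaces the role of the hedge by a system of isotone Galois connections):

```latex
% Hedge-parameterized validity of a fuzzy attribute implication
% A => B in a fuzzy set M of attributes (schematic):
\|A \Rightarrow B\|_M \;=\; S(A, M)^{*} \rightarrow S(B, M),
% where S(.,.) is the degree of subsethood, -> is the residuum of the
% underlying complete residuated lattice, and * is a linguistic hedge.
```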


On the Challenges of Physical Implementations of RBMs

arXiv.org Machine Learning

Restricted Boltzmann machines (RBMs) are powerful machine learning models, but learning and some kinds of inference in the model require sampling-based approximations, which, in classical digital computers, are implemented using expensive MCMC. Physical computation offers the opportunity to reduce the cost of sampling by building physical systems whose natural dynamics correspond to drawing samples from the desired RBM distribution. Such a system avoids the burn-in and mixing cost of a Markov chain. However, hardware implementations of this variety usually entail limitations such as low precision and limited range of the parameters, and restrictions on the size and topology of the RBM. We conduct software simulations to determine how harmful each of these restrictions is. Our simulations are designed to reproduce aspects of the D-Wave quantum computer, but the issues we investigate arise in most forms of physical computation.
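
A minimal illustration of the kind of software simulation described, not the paper's code: parameters are clipped and quantized before block Gibbs sampling, mimicking low-precision, limited-range hardware (bit widths and model sizes below are arbitrary choices):

```python
# Minimal sketch: simulate hardware-limited RBM parameters by clipping
# and quantizing them, then run block Gibbs sampling as usual.
import numpy as np

def quantize(x, n_bits, lim):
    """Clip to [-lim, lim] and round onto a (2**n_bits)-level grid."""
    levels = 2 ** n_bits - 1
    x = np.clip(x, -lim, lim)
    return np.round((x + lim) / (2 * lim) * levels) / levels * (2 * lim) - lim

def gibbs_step(v, W, b, c, rng):
    """One block Gibbs sweep of a binary RBM with weights W, biases b, c."""
    ph = 1.0 / (1.0 + np.exp(-(v @ W + c)))        # P(h = 1 | v)
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = 1.0 / (1.0 + np.exp(-(h @ W.T + b)))      # P(v = 1 | h)
    return (rng.random(pv.shape) < pv).astype(float)

# Example: sample from a small RBM with 4-bit, range-limited parameters.
rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, size=(16, 8)); b = np.zeros(16); c = np.zeros(8)
Wq, bq, cq = quantize(W, 4, 1.0), quantize(b, 4, 1.0), quantize(c, 4, 1.0)
v = rng.integers(0, 2, size=16).astype(float)
for _ in range(100):
    v = gibbs_step(v, Wq, bq, cq, rng)
```

Comparing statistics of samples drawn with full-precision versus quantized parameters is one way to measure how harmful a given precision restriction is.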


Median Selection Subset Aggregation for Parallel Inference

arXiv.org Machine Learning

For massive data sets, efficient computation commonly relies on distributed algorithms that store and process subsets of the data on different machines, minimizing communication costs. Our focus is on regression and classification problems involving many features. A variety of distributed algorithms have been proposed in this context, but challenges arise in defining an algorithm with low communication, theoretical guarantees, and excellent practical performance in general settings. We propose a MEdian Selection Subset AGgregation Estimator (message) algorithm, which attempts to solve these problems. The algorithm applies feature selection in parallel for each subset using Lasso or another method, calculates the `median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in both sample and feature size, and has theoretical guarantees. In particular, we show model selection consistency and coefficient estimation efficiency. Extensive experiments show excellent performance in variable selection, estimation, prediction, and computation time relative to usual competitors.
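
A minimal sketch of the message pipeline as the abstract describes it (our reading, with parallelism simulated sequentially; estimator choices such as LassoCV and the 0.5 vote threshold are illustrative assumptions):

```python
# Minimal sketch of message (not the authors' code): per-subset Lasso
# selection, median inclusion vote, per-subset least squares on the
# voted features, then averaging of the coefficient estimates.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def message(X_parts, y_parts):
    """X_parts, y_parts: lists of per-machine data subsets."""
    p = X_parts[0].shape[1]
    # 1) Feature selection on each subset (in parallel in practice).
    inclusion = np.array([(LassoCV(cv=5).fit(X, y).coef_ != 0).astype(int)
                          for X, y in zip(X_parts, y_parts)])
    # 2) Median inclusion index: keep features selected on >= half the subsets.
    selected = np.where(np.median(inclusion, axis=0) >= 0.5)[0]
    # 3) Least squares on the selected features per subset, then average.
    betas = [LinearRegression().fit(X[:, selected], y).coef_
             for X, y in zip(X_parts, y_parts)]
    beta = np.zeros(p)
    beta[selected] = np.mean(betas, axis=0)
    return beta
```

Only the binary inclusion vectors and the low-dimensional coefficient estimates cross machine boundaries, which is what keeps the communication cost low.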