Plotting

 Anton, Cristina


Cluster weighted models with multivariate skewed distributions for functional data

arXiv.org Machine Learning

Cristina Anton (1), Roy Shivam Ram Shreshtth (2)

(1) Department of Mathematics and Statistics, MacEwan University, 103C, 10700-104 Ave., Edmonton, AB T5J 4S2, Canada, email: popescuc@macewan.ca
(2) Department of Mathematics and Statistics, Indian Institute of Technology Kanpur

Abstract

We propose a clustering method, funWeightClustSkew, based on mixtures of functional linear regression models and three skewed multivariate distributions: the variance-gamma distribution, the skew-t distribution, and the normal-inverse Gaussian distribution. Our approach follows the framework of the functional high dimensional data clustering (funHDDC) method, and we extend to functional data the cluster weighted models based on skewed distributions used for finite dimensional multivariate data. We consider several parsimonious models, and to estimate the parameters we construct an expectation maximization (EM) algorithm. We illustrate the performance of funWeightClustSkew for simulated data and for the Air Quality dataset.

Keywords: Cluster weighted models, Functional linear regression, EM algorithm, Skewed distributions, Multivariate functional principal component analysis

1 Introduction

Smart devices and other modern technologies record huge amounts of data measured continuously in time. These data are better represented as curves instead of finite-dimensional vectors, and they are analyzed using statistical methods specific to functional data (Ramsay and Silverman, 2006; Ferraty and Vieu, 2006; Horváth and Kokoszka, 2012). Many times more than one curve is collected for one individual, e.g.


Cluster weighted models for functional data

arXiv.org Machine Learning

We propose a method, funWeightClust, based on a family of parsimonious models for clustering heterogeneous functional linear regression data. These models extend cluster weighted models to functional data, and they allow for multivariate functional responses and predictors. The proposed methodology follows the approach used by the functional high dimensional data clustering (funHDDC) method. We construct an expectation maximization (EM) algorithm for parameter estimation. Using simulated and benchmark data, we show that funWeightClust outperforms funHDDC and several two-step clustering methods. We also use funWeightClust to analyze traffic patterns in Edmonton, Canada.
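
To make the idea of a cluster weighted model fitted by EM concrete, the following is a minimal sketch for the simplest finite-dimensional case: one scalar predictor x, one scalar response y, and Gaussian components. It is not the funWeightClust or funWeightClustSkew algorithm (those work with functional data, parsimonious covariance structures, and, in the skewed case, non-Gaussian distributions). The names `em_cwm` and `normal_pdf`, the random initialization, and the variance floor are assumptions made for illustration only. Each component k contributes pi_k * N(x | mu_k, tau2_k) * N(y | b0_k + b1_k x, s2_k) to the joint density.

```python
import math
import random


def normal_pdf(v, mean, var):
    """Density of a univariate normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_cwm(xs, ys, K=2, n_iter=50, seed=0):
    """EM for a scalar Gaussian cluster weighted model (illustrative sketch).

    Returns (params, resp, loglik_trace), where params[k] is the tuple
    (pi_k, mu_k, tau2_k, b0_k, b1_k, s2_k).
    """
    rng = random.Random(seed)
    n = len(xs)
    # Assumed initialization: random soft assignments, normalized per observation.
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(K)]
        total = sum(row)
        resp.append([r / total for r in row])

    params, loglik_trace = [], []
    for _ in range(n_iter):
        # M-step: weighted moments give closed-form updates for each component.
        params = []
        for k in range(K):
            w = [resp[i][k] for i in range(n)]
            sw = sum(w)
            pi_k = sw / n
            mu = sum(wi * x for wi, x in zip(w, xs)) / sw
            my = sum(wi * y for wi, y in zip(w, ys)) / sw
            tau2 = sum(wi * (x - mu) ** 2 for wi, x in zip(w, xs)) / sw
            tau2 = max(tau2, 1e-6)  # variance floor (assumption, avoids degeneracy)
            sxy = sum(wi * (x - mu) * (y - my) for wi, x, y in zip(w, xs, ys)) / sw
            b1 = sxy / tau2          # weighted least-squares slope
            b0 = my - b1 * mu        # weighted least-squares intercept
            s2 = sum(wi * (y - b0 - b1 * x) ** 2 for wi, x, y in zip(w, xs, ys)) / sw
            s2 = max(s2, 1e-6)
            params.append((pi_k, mu, tau2, b0, b1, s2))

        # E-step: posterior component probabilities and observed-data log-likelihood.
        loglik = 0.0
        for i in range(n):
            dens = [
                pi_k * normal_pdf(xs[i], mu, tau2) * normal_pdf(ys[i], b0 + b1 * xs[i], s2)
                for (pi_k, mu, tau2, b0, b1, s2) in params
            ]
            total = sum(dens) + 1e-300  # guard against numerical underflow
            loglik += math.log(total)
            resp[i] = [d / total for d in dens]
        loglik_trace.append(loglik)
    return params, resp, loglik_trace


# Usage on synthetic data with two regression regimes (made-up values).
data_rng = random.Random(1)
xs, ys = [], []
for _ in range(100):
    x = data_rng.gauss(0.0, 1.0)
    xs.append(x)
    ys.append(1.0 + 2.0 * x + data_rng.gauss(0.0, 0.3))
for _ in range(100):
    x = data_rng.gauss(5.0, 1.0)
    xs.append(x)
    ys.append(10.0 - x + data_rng.gauss(0.0, 0.3))

params, resp, trace = em_cwm(xs, ys, K=2, n_iter=30, seed=0)
for p in params:
    print([round(v, 3) for v in p])
```

Unlike a plain mixture of regressions, the component weights here depend on the predictor through the marginal density of x, which is what distinguishes a cluster weighted model. EM can stop at a local optimum, so in practice several random starts would be compared by their final log-likelihood.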