Collaborating Authors

 Chan, Rosa H. M.


Reduced-Rank Linear Dynamical Systems

AAAI Conferences

Linear Dynamical Systems (LDS) are widely used to study the underlying patterns of multivariate time series. A basic assumption of these models is that a high-dimensional time series can be characterized by underlying low-dimensional, time-varying latent states. However, existing approaches to LDS modeling mostly learn the latent space with a prescribed dimensionality. When applied to short, high-dimensional time series, such models overfit easily. We propose Reduced-Rank Linear Dynamical Systems (RRLDS) to automatically retrieve the intrinsic dimensionality of the latent space during model learning. Our key observation is that the rank of the dynamics matrix of an LDS captures the intrinsic dimensionality, and that variational inference with a reduced-rank regularization leads to a concise, structured, and interpretable latent space. To enable our method to handle count-valued data, we introduce a dispersion-adaptive distribution that accommodates the over-, equal-, and under-dispersed nature of such data. Results on both simulated and experimental data demonstrate that our model can robustly learn the latent space from short, noisy, count-valued data and significantly improves prediction performance over state-of-the-art methods.
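
The observation that the rank of the dynamics matrix captures the intrinsic dimensionality can be illustrated with a small, noise-free toy system. The sketch below is not the RRLDS variational inference procedure from the paper; the matrices `U` and `W`, the initial state, and the helper names `matvec` and `rank` are all assumptions chosen for illustration. It builds a 4 x 4 dynamics matrix A = U W of rank 2 and checks that the states generated by x_{t+1} = A x_t span only a 2-dimensional subspace.

```python
from fractions import Fraction


def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rank(rows):
    """Exact matrix rank via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Hypothetical factors: U maps a 2-d latent state into 4-d, W maps back.
U = [[1, 0], [0, 1], [1, 1], [1, -1]]
W = [[1, 1, 0, 0], [0, 1, 0, 0]]

# Dynamics matrix A = U W; its rank is at most 2 by construction.
A = [[sum(U[i][k] * W[k][j] for k in range(2)) for j in range(4)]
     for i in range(4)]

# Roll the noise-free dynamics forward from an arbitrary initial state.
x = [1, 2, 3, 4]
states = []
for _ in range(6):
    x = matvec(A, x)
    states.append(x)

print(rank(A), rank(states))
```

Both ranks come out as 2: after the first step every state lies in the column space of A, so the 4-dimensional trajectory is effectively 2-dimensional. In real data, noise makes the rank numerically full, which is presumably why a regularized estimate is needed rather than a direct rank computation.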


Modeling Short Over-Dispersed Spike Count Data: A Hierarchical Parametric Empirical Bayes Framework

arXiv.org Machine Learning

In this letter, a Hierarchical Parametric Empirical Bayes model is proposed to model spike count data. We have integrated Generalized Linear Models (GLMs) and empirical Bayes theory to simultaneously provide three advantages: (1) a model of over-dispersion of spike count values; (2) reduced mean squared error (MSE) in estimation compared with the maximum likelihood method for GLMs; and (3) an efficient alternative to inference with fully Bayesian estimators. We apply the model to both simulated data and experimental neural data from the retina. The simulation results indicate that the new model can estimate both the weights of connections among neural populations and the output firing rates (mean spike counts) efficiently and accurately. The results from the retinal datasets show that the proposed model outperforms both standard Poisson and Negative Binomial GLMs in terms of the prediction log-likelihood of held-out datasets.
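
Over-dispersion here means that the variance of the spike counts exceeds their mean, which a Poisson model cannot represent because it forces the two to be equal. A minimal sketch of how one might check this on a set of counts, using the variance-to-mean ratio (Fano factor), follows. This is a generic diagnostic, not the hierarchical model from the letter; the function names, the tolerance `tol`, and the example counts are assumptions made for illustration.

```python
from statistics import mean, variance


def fano_factor(counts):
    """Sample variance divided by sample mean (undefined if the mean is 0)."""
    return variance(counts) / mean(counts)


def dispersion_label(counts, tol=0.2):
    """Label counts relative to a Poisson model, where the ratio equals 1.

    tol is an arbitrary illustrative tolerance, not a statistical test.
    """
    f = fano_factor(counts)
    if f > 1 + tol:
        return "over-dispersed"
    if f < 1 - tol:
        return "under-dispersed"
    return "approximately equi-dispersed"


# Made-up spike counts per trial, both with mean 2.
bursty = [0, 0, 1, 0, 7, 0, 2, 9, 0, 1]
regular = [2, 2, 3, 2, 2, 1, 2, 2, 2, 2]

print(dispersion_label(bursty))   # variance well above the mean
print(dispersion_label(regular))  # variance well below the mean
```

With short recordings the sample variance is itself noisy, so a ratio near 1 should be read cautiously.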


Mutual Information-Based Unsupervised Feature Transformation for Heterogeneous Feature Subset Selection

arXiv.org Machine Learning

Conventional mutual information (MI) based feature selection (FS) methods are unable to handle heterogeneous feature subset selection properly because of differences in data format or in the methods used to estimate MI between a feature subset and the class label. One way to solve this problem is feature transformation (FT). In this study, a novel unsupervised feature transformation (UFT), which transforms non-numerical features into numerical features, is developed and tested. The UFT process is MI-based and independent of the class label. MI-based FS algorithms, such as the Parzen window feature selector (PWFS), minimum redundancy maximum relevance feature selection (mRMR), and normalized MI feature selection (NMIFS), can all adopt UFT for pre-processing of non-numerical features. Unlike traditional FT methods, the proposed UFT is unbiased and allows PWFS to be utilized to its full advantage. Simulations and analyses of large-scale datasets showed that the feature subsets selected by the integrated method, UFT-PWFS, outperformed those selected by other FT-FS integrated methods in classification accuracy.
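
For readers unfamiliar with the quantity these methods build on, the sketch below computes the plug-in (empirical-frequency) estimate of mutual information between two discrete variables, which works directly on non-numerical labels. It is not the UFT procedure or any of the selectors named above; the function name and the toy features are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in MI estimate (in bits) between two equal-length label sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    pxy = Counter(zip(xs, ys))     # joint counts of (x, y)
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


# Toy categorical features (made up).
colour = ["red", "red", "blue", "blue"]
shape = ["disc", "disc", "cube", "cube"]   # determined by colour
size = ["big", "small", "big", "small"]    # independent of colour

print(mutual_information(colour, shape))
print(mutual_information(colour, size))
```

The first pair shares one full bit of information because each colour determines the shape, while the second pair shares none. The plug-in estimate is biased upward on small samples, which is one reason estimation method matters when comparing heterogeneous features.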