 spectral structure


Spectral structure learning for clinical time series

arXiv.org Artificial Intelligence

We develop and evaluate a structure learning algorithm for clinical time series. Clinical time series are multivariate time series observed in multiple patients and irregularly sampled, which challenges existing structure learning algorithms. We assume that our time series are realizations of StructGP, a k-dimensional multi-output (multi-task) stationary Gaussian process (GP), with independent patients sharing the same covariance function. StructGP encodes ordered conditional relations between time series, represented as a directed acyclic graph. We implement an adapted NOTEARS algorithm which, based on a differentiable definition of acyclicity, recovers the graph by solving a series of continuous optimization problems. Simulation results show that, for graphs with mean degree up to 3 and up to 20 tasks, we reach a median recall of 0.93 [IQR 0.86-0.97] while keeping a median precision of 0.71 [IQR 0.57-0.84] for recovering directed edges. We further show that the regularization path is key to identifying the graph. With StructGP, we propose a model of time series dependencies that flexibly adapts to different degrees of time series regularity while enabling us to learn these dependencies from observations.
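
To illustrate the "differentiable definition of acyclicity" mentioned above: the original NOTEARS formulation scores a weighted adjacency matrix W of size d x d with h(W) = tr(exp(W ∘ W)) - d, where ∘ is the elementwise product. This score is zero exactly when the weighted graph has no directed cycle. The sketch below is a minimal pure-Python version of that score only. The function name `acyclicity` and the truncated Taylor series are assumptions made for illustration, and it is not the adapted algorithm used for StructGP.

```python
def acyclicity(W, terms=30):
    """NOTEARS-style score h(W) = tr(exp(W * W)) - d (elementwise square).

    Illustrative sketch: the matrix exponential is approximated by a
    truncated Taylor series, which is adequate for small, well-scaled W.
    """
    d = len(W)
    # Elementwise square makes every entry non-negative.
    A = [[W[i][j] ** 2 for j in range(d)] for i in range(d)]
    # `power` holds A^k / k!, starting at k = 0 (the identity matrix).
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    trace = float(d)  # trace of the identity term
    for k in range(1, terms + 1):
        power = [
            [sum(power[i][m] * A[m][j] for m in range(d)) / k for j in range(d)]
            for i in range(d)
        ]
        trace += sum(power[i][i] for i in range(d))
    return trace - d


# A chain 0 -> 1 -> 2 is acyclic, so its score is zero.
h_dag = acyclicity([[0.0, 1.5, 0.0],
                    [0.0, 0.0, -2.0],
                    [0.0, 0.0, 0.0]])

# A two-node cycle 0 <-> 1 gets a strictly positive score.
h_cycle = acyclicity([[0.0, 1.0],
                      [1.0, 0.0]])
```

Because h is smooth in W, it can serve as an equality constraint in a continuous optimization problem, in place of a combinatorial search over graphs.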


Sparsistent filtering of comovement networks from high-dimensional data

arXiv.org Machine Learning

Network representation of large-dimensional complex systems has become a standard methodology for delineating the nature of the linkages among the many constituent entities of such systems [33]. Examples span systems that vary widely in nature and architecture: economic and financial networks [9, 5], social networks [44], biological networks such as food webs [45], technological networks such as the World Wide Web [20], and transportation networks [40], among many others. Broadly speaking, two major strands of literature start from the analysis of the realized network. One strand uses networks to explore dynamics on them [32], taking the realized network as the true representation of the linkages. The other strand works backward to extract the true linkages from the realized ones [4, 36], on the premise that some of the realized linkages may in fact be spurious. We are interested in the second strand, whose fundamental objective is to isolate and filter the key subnetwork out of a large-dimensional realized network.


Model Selection for Topic Models via Spectral Decomposition

arXiv.org Machine Learning

Topic models have achieved significant success in analyzing large-scale text corpora. In practical applications, we are always confronted with the challenge of model selection, i.e., how to set the number of topics appropriately. Following recent advances in topic model inference via tensor decomposition, we make a first attempt to provide a theoretical analysis of model selection in latent Dirichlet allocation. Under mild conditions, we derive upper and lower bounds on the number of topics given a text collection of finite size. Experimental results demonstrate that our bounds are accurate and tight. Furthermore, using the Gaussian mixture model as an example, we show that our methodology can be easily generalized to model selection analysis for other latent variable models.