Bar-hillel, Aharon
Spike Sorting: Bayesian Clustering of Non-Stationary Data
Bar-hillel, Aharon, Spiro, Adam, Stark, Eran
Spike sorting involves clustering spike trains recorded by a microelectrode according to the source neuron. It is a complicated problem that requires considerable human labor, partly because the data are non-stationary. We propose an automated technique for the clustering of non-stationary Gaussian sources in a Bayesian framework. In a first search stage, the data are divided into short time frames, and candidate descriptions of the data as a mixture of Gaussians are computed for each frame. In a second stage, transition probabilities between candidate mixtures are computed, and a globally optimal clustering is found as the MAP solution of the resulting probabilistic model. Transition probabilities are computed using local stationarity assumptions and are based on a Gaussian version of the Jensen-Shannon divergence. The method was applied to several recordings. Its performance appeared almost indistinguishable from that of humans in a wide range of scenarios, including movement, merges, and splits of clusters.
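The second stage described above can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the model is a simple chain in which one candidate mixture is chosen per frame, so that the MAP solution can be found by Viterbi-style dynamic programming. The function name `map_path` and both inputs (`frame_scores`, `transition_logp`) are hypothetical; in particular, the divergence-based computation of the transition probabilities is not shown here.

```python
def map_path(frame_scores, transition_logp):
    """Viterbi-style search for the best sequence of candidates (a sketch).

    frame_scores[t][i]: log-score of candidate i in frame t (hypothetical).
    transition_logp[t][i][j]: log transition probability from candidate i in
    frame t to candidate j in frame t + 1 (hypothetical).
    Returns (best_log_score, list of chosen candidate indices).
    """
    best = list(frame_scores[0])  # best log-score of a path ending at each candidate
    back = []                     # back-pointers, one list per frame after the first
    for t in range(1, len(frame_scores)):
        new_best, pointers = [], []
        for j, score_j in enumerate(frame_scores[t]):
            # Pick the predecessor that maximizes the accumulated log-score.
            prev = max(range(len(best)),
                       key=lambda i: best[i] + transition_logp[t - 1][i][j])
            new_best.append(best[prev] + transition_logp[t - 1][prev][j] + score_j)
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Trace the best path backwards from the best final candidate.
    last = max(range(len(best)), key=lambda i: best[i])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[last], path


# Toy example: three frames, two candidates per frame, "sticky" transitions.
scores = [[0.0, -1.0], [0.0, 0.0], [-2.0, 0.0]]
sticky = [[-0.1, -3.0], [-3.0, -0.1]]
best_score, best_path = map_path(scores, [sticky, sticky])
```

In this toy input the first frame slightly prefers candidate 0, but the last frame and the transition costs make the all-ones path the global optimum, which a frame-by-frame greedy choice would miss.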
Computing Gaussian Mixture Models with EM Using Equivalence Constraints
Shental, Noam, Bar-hillel, Aharon, Hertz, Tomer, Weinshall, Daphna
Density estimation with Gaussian Mixture Models is a popular generative technique that is also used for clustering. We develop a framework to incorporate side information in the form of equivalence constraints into the model estimation procedure. Equivalence constraints are defined on pairs of data points, indicating whether the points arise from the same source (positive constraints) or from different sources (negative constraints). Such constraints can be gathered automatically in some learning problems, and are a natural form of supervision in others. For the estimation of model parameters we present a closed form EM procedure which handles positive constraints, and a Generalized EM procedure using a Markov net which handles negative constraints. Using publicly available data sets we demonstrate that such side information can lead to considerable improvement in clustering tasks, and that our algorithm is preferable to two other suggested methods using the same type of side information.
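To make the role of positive constraints concrete, here is a minimal sketch of an E-step in which points known to share a source are grouped and receive a single posterior over mixture components. It is an illustration under stated assumptions, not the procedure from the paper: the data are one-dimensional, the grouping into lists (`chunklets`) is given, each group is weighted by one mixing weight, and the names `gaussian_logpdf` and `chunklet_posteriors` are hypothetical. Negative constraints are not covered.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def chunklet_posteriors(chunklets, weights, means, variances):
    """E-step responsibilities under positive constraints (a sketch).

    Each chunklet is a list of 1-D points assumed to come from the same
    source, so the whole chunklet gets one posterior over components.
    Assumption: one mixing weight is applied per chunklet, not per point.
    """
    result = []
    for points in chunklets:
        # Joint log-score of the chunklet under each component.
        logs = [math.log(w) + sum(gaussian_logpdf(x, m, v) for x in points)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # subtract the maximum for numerical stability
        unnorm = [math.exp(value - top) for value in logs]
        total = sum(unnorm)
        result.append([u / total for u in unnorm])
    return result


# Two components centred at 0 and 4; the second chunklet ties the points 0 and 4.
post = chunklet_posteriors([[0.0], [0.0, 4.0]],
                           weights=[0.5, 0.5], means=[0.0, 4.0],
                           variances=[1.0, 1.0])
```

An unconstrained point at 0 is assigned almost entirely to the first component, whereas the constrained pair (0, 4) is forced to share one assignment and ends up split evenly between the two components.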