Case-Based Reasoning
Revised Conditional t-SNE: Looking Beyond the Nearest Neighbors
Heiter, Edith, Kang, Bo, Seurinck, Ruth, Lijffijt, Jefrey
Conditional t-SNE (ct-SNE) is a recent extension to t-SNE that allows removal of known cluster information from the embedding, to obtain a visualization revealing structure beyond label information. This is useful, for example, when one wants to factor out unwanted differences between a set of classes. We show that ct-SNE fails in many realistic settings, namely if the data is well clustered over the labels in the original high-dimensional space. We introduce a revised method by conditioning the high-dimensional similarities instead of the low-dimensional similarities and storing within- and across-label nearest neighbors separately. This also enables the use of recently proposed speedups for t-SNE, improving the scalability. From experiments on synthetic data, we find that our proposed method resolves the considered problems and improves the embedding quality. On real data containing batch effects, the expected improvement is not always there. We argue revised ct-SNE is preferable overall, given its improved scalability. The results also highlight new open questions, such as how to handle distance variations between clusters.
Nearest-Neighbor Sampling Based Conditional Independence Testing
Li, Shuai, Chen, Ziqi, Zhu, Hongtu, Wang, Christina Dan, Wen, Wang
The conditional randomization test (CRT) was recently proposed to test whether two random variables X and Y are conditionally independent given random variables Z. The CRT assumes that the conditional distribution of X given Z is known under the null hypothesis and then it is compared to the distribution of the observed samples of the original data. The aim of this paper is to develop a novel alternative of CRT by using nearest-neighbor sampling without assuming the exact form of the distribution of X given Z. Specifically, we utilize the computationally efficient 1-nearest-neighbor to approximate the conditional distribution that encodes the null hypothesis. Then, theoretically, we show that the distribution of the generated samples is very close to the true conditional distribution in terms of total variation distance. Furthermore, we take the classifier-based conditional mutual information estimator as our test statistic. The test statistic as an empirical fundamental information theoretic quantity is able to well capture the conditional-dependence feature. We show that our proposed test is computationally very fast, while controlling type I and II errors quite well. Finally, we demonstrate the efficiency of our proposed test in both synthetic and real data analyses.
Asymptotic slowing down of the nearest-neighbor classifier
Although the value of the coefficient a depends upon the underlying probability distributions, the exponent of M is largely distri(cid:173) bution free. We thus obtain a concise relation between a classifier's ability to generalize from a finite reference sample and the dimensionality of the feature space, as well as an analytic validation of Bellman's well known "curse of dimensionality."
Discriminant Adaptive Nearest Neighbor Classification and Regression
Nearest neighbor classification expects the class conditional prob(cid:173) abilities to be locally constant, and suffers from bias in high di(cid:173) mensions We propose a locally adaptive form of nearest neighbor classification to try to finesse this curse of dimensionality. We use a local linear discriminant analysis to estimate an effective met(cid:173) ric for computing neighborhoods. We determine the local decision boundaries from centroid information, and then shrink neighbor(cid:173) hoods in directions orthogonal to these local decision boundaries, and elongate them parallel to the boundaries. Thereafter, any neighborhood-based classifier can be employed, using the modified neighborhoods. We also propose a method for global dimension reduction, that combines local dimension information.
Analog VLSI Model of Intersegmental Coordination with Nearest-Neighbor Coupling
We have a developed an analog VLSI system that models the coordina(cid:173) tion of neurobiological segmental oscillators. We have implemented and tested a system that consists of a chain of eleven pattern generating cir(cid:173) cuits that are synaptically coupled to their nearest neighbors. Each pat(cid:173) tern generating circuit is implemented with two silicon Morris-Lecar neurons that are connected in a reciprocally inhibitory network. We dis(cid:173) cuss the mechanisms of oscillations in the two-cell network and explore system behavior based on isotropic and anisotropic coupling, and fre(cid:173) quency gradients along the chain of oscillators.
An Incremental Nearest Neighbor Algorithm with Queries
We consider the general problem of learning multi-category classifi(cid:173) cation from labeled examples. We present experimental results for a nearest neighbor algorithm which actively selects samples from different pattern classes according to a querying rule instead of the a priori class probabilities. The amount of improvement of this query-based approach over the passive batch approach depends on the complexity of the Bayes rule. The principle on which this al(cid:173) gorithm is based is general enough to be used in any learning algo(cid:173) rithm which permits a model-selection criterion and for which the error rate of the classifier is calculable in terms of the complexity of the model.
An Investigation of Practical Approximate Nearest Neighbor Algorithms
This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimen- sional perception areas such as computer vision, with dozens of publica- tions in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hash- ing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We show why these structures should be able to exploit the same random- projection-based approximations that LSH enjoys, but with a simpler al- gorithm and perhaps with greater efficiency.
An Analog Visual Pre-Processing Processor Employing Cyclic Line Access in Only-Nearest-Neighbor-Interconnects Architecture
An analog focal-plane processor having a 128 128 photodiode array has been developed for directional edge filtering. It can perform 4 4-pixel kernel convolution for entire pixels only with 256 steps of simple ana- log processing. Newly developed cyclic line access and row-parallel processing scheme in conjunction with the "only-nearest-neighbor in- terconnects" architecture has enabled a very simple implementation. A proof-of-concept chip was fabricated in a 0.35-(cid:0)m 2-poly 3-metal CMOS technology and the edge filtering at a rate of 200 frames/sec.
Nearest Neighbor Based Feature Selection for Regression and its Application to Neural Activity
We present a non-linear, simple, yet effective, feature subset selection method for regression and use it in analyzing cortical neural activity. Our algorithm involves a feature-weighted version of the k-nearest-neighbor algorithm. It is able to capture complex dependency of the target func- tion on its input and makes use of the leave-one-out error as a natural regularization. We explain the characteristics of our algorithm on syn- thetic problems and use it in the context of predicting hand velocity from spikes recorded in motor cortex of a behaving monkey. By applying fea- ture selection we are able to improve prediction quality and suggest a novel way of exploring neural data.
Nearest-Neighbor-Based Active Learning for Rare Category Detection
Rare category detection is an open challenge for active learning, especially in the de-novo case (no labeled examples), but of significant practical importance for data mining - e.g. This paper develops a new method for detecting an instance of each minority class via an unsupervised local-density-differential sampling strategy. Essentially a variable-scale nearest neighbor process is used to optimize the probability of sampling tightly-grouped minority classes, subject to a local smoothness assumption of the majority class. Results on both synthetic and real data sets are very positive, detecting each minority class with only a frac- tion of the actively sampled points required by random sampling and by Pelleg's Interleave method, the prior best technique in the sparse literature on this topic.