Tishby, Naftali
Tight Sample Complexity of Large-Margin Learning
Sabato, Sivan, Srebro, Nathan, Tishby, Naftali
We obtain a tight distribution-specific characterization of the sample complexity of large-margin classification with L_2 regularization: We introduce the \gamma-adapted-dimension, which is a simple function of the spectrum of a distribution's covariance matrix, and show distribution-specific upper and lower bounds on the sample complexity, both governed by the \gamma-adapted-dimension of the source distribution. We conclude that this new quantity tightly characterizes the true sample complexity of large-margin classification. The bounds hold for a rich family of sub-Gaussian distributions.
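The abstract does not spell out the definition of the gamma-adapted-dimension, only that it is a simple function of the covariance spectrum. As an illustration only, here is a minimal Python sketch assuming a definition of the form k_gamma = min{k : sum_{i>k} lambda_i <= gamma^2 k} over the sorted eigenvalues lambda_i; the exact form used in the paper may differ.

import numpy as np

def gamma_adapted_dimension(X, gamma):
    # Assumed form (illustration only): the smallest k such that the
    # covariance eigenvalues beyond the top k sum to at most gamma^2 * k.
    cov = np.cov(X, rowvar=False)                       # sample covariance matrix
    spectrum = np.sort(np.linalg.eigvalsh(cov))[::-1]   # eigenvalues, descending
    for k in range(1, len(spectrum) + 1):
        if spectrum[k:].sum() <= gamma ** 2 * k:
            return k
    return len(spectrum)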
Cluster Stability for Finite Samples
Shamir, Ohad, Tishby, Naftali
Over the past few years, the notion of stability in data clustering has received growing attention as a cluster validation criterion in a sample-based framework. However, recent work has shown that as the sample size increases, any clustering model will usually become asymptotically stable. This led to the conclusion that stability is lacking as a theoretical and practical tool. The discrepancy between this conclusion and the success of stability in practice has remained an open question, which we attempt to address. Our theoretical approach is that stability, as used by cluster validation algorithms, is similar in certain respects to measures of generalization in a model-selection framework. In such cases, the model chosen governs the convergence rate of generalization bounds. By arguing that these rates are more important than the sample size, we are led to the prediction that stability-based cluster validation algorithms should not degrade with increasing sample size, despite the asymptotic universal stability. This prediction is substantiated by a theoretical analysis as well as some empirical results. We conclude that stability remains a meaningful cluster validation criterion over finite samples.
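For readers unfamiliar with stability-based validation, the sketch below shows one generic protocol of the kind the abstract refers to: cluster two random subsamples, extend both clusterings to all points, and measure their agreement. The choice of k-means and the adjusted Rand index are illustrative assumptions, not the specific procedure analyzed in the paper.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def stability_score(X, k, n_pairs=10, subsample=0.5, seed=0):
    # Cluster two random subsamples, extend both clusterings to all points
    # via nearest centroid, and report the mean agreement (higher = more stable).
    rng = np.random.default_rng(seed)
    n, m = len(X), int(subsample * len(X))
    scores = []
    for _ in range(n_pairs):
        idx1 = rng.choice(n, size=m, replace=False)
        idx2 = rng.choice(n, size=m, replace=False)
        km1 = KMeans(n_clusters=k, n_init=10).fit(X[idx1])
        km2 = KMeans(n_clusters=k, n_init=10).fit(X[idx2])
        scores.append(adjusted_rand_score(km1.predict(X), km2.predict(X)))
    return float(np.mean(scores))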
Information Bottleneck for Non Co-Occurrence Data
Seldin, Yevgeny, Slonim, Noam, Tishby, Naftali
We present a general model-independent approach to the analysis of data in cases when these data do not appear in the form of co-occurrences of two variables X, Y, but rather as a sample of values of an unknown (stochastic) function Z(X,Y). For example, in gene expression data, the expression level Z is a function of gene X and condition Y; or in movie ratings data, the rating Z is a function of viewer X and movie Y. The approach represents a consistent extension of the Information Bottleneck method, which has previously relied on the availability of co-occurrence statistics. By altering the relevance variable we eliminate the need for a sample of the joint distribution of all input variables. This new formulation also enables simple MDL-like model complexity control and prediction of missing values of Z. The approach is analyzed and shown to be on a par with the best known clustering algorithms for a wide range of domains. For the prediction of missing values (collaborative filtering) it improves on the currently best known results.
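To make the setting concrete, the snippet below contrasts co-occurrence data with a sample of values of an unknown function Z(X,Y); the example values are hypothetical.

# Co-occurrence data: a sample of (x, y) pairs, e.g. (word, document) co-occurrences.
cooccurrence_sample = [("model", "doc1"), ("data", "doc1"), ("model", "doc2")]

# Non co-occurrence data: a sample of values of an unknown function Z(X, Y),
# e.g. ratings observed only for some (viewer, movie) pairs; predicting the
# missing values of Z is the collaborative-filtering task mentioned above.
ratings_sample = [("viewer1", "movieA", 4), ("viewer1", "movieB", 2), ("viewer2", "movieA", 5)]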
Generalization in Clustering with Unobserved Features
Krupka, Eyal, Tishby, Naftali
We argue that when objects are characterized by many attributes, clustering them on the basis of a relatively small random subset of these attributes can capture information on the unobserved attributes as well. Moreover, we show that under mild technical conditions, clustering the objects on the basis of such a random subset performs almost as well as clustering with the full attribute set. We prove finite-sample generalization theorems for this novel learning scheme that extend analogous results from the supervised learning setting. The scheme is demonstrated for collaborative filtering of users with movie ratings as attributes.
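A minimal sketch of the setting the abstract describes, assuming k-means as the clustering algorithm and the adjusted Rand index as the comparison measure (both illustrative choices, not taken from the paper): cluster on a random attribute subset and compare with clustering on the full attribute set.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def random_subset_vs_full_clustering(X, n_observed, k, seed=0):
    # Cluster the objects using only a random subset of the attributes and
    # compare against clustering with the full attribute set.
    rng = np.random.default_rng(seed)
    observed = rng.choice(X.shape[1], size=n_observed, replace=False)
    labels_subset = KMeans(n_clusters=k, n_init=10).fit_predict(X[:, observed])
    labels_full = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    return adjusted_rand_score(labels_subset, labels_full)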
Query by Committee Made Real
Gilad-Bachrach, Ran, Navot, Amir, Tishby, Naftali
Training a learning algorithm is a costly task. A major goal of active learning is to reduce this cost. In this paper we introduce a new algorithm, KQBC, which is capable of actively learning large-scale problems by using selective sampling. The algorithm overcomes the costly sampling step of the well-known Query By Committee (QBC) algorithm by projecting onto a low-dimensional space. KQBC also enables the use of kernels, providing a simple way of extending QBC to the nonlinear scenario. Sampling the low-dimensional space is done using the hit-and-run random walk. We demonstrate the success of this novel algorithm by applying it to both artificial and real-world problems.
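KQBC itself samples the version space restricted to a low-dimensional, kernel-induced subspace with the hit-and-run walk, which is beyond a short sketch; the skeleton below only shows the generic Query By Committee selective-sampling loop it builds on, with sample_hypothesis and label_oracle as hypothetical callables supplied by the caller.

def qbc_selective_sampling(stream, sample_hypothesis, label_oracle):
    # Generic Query-By-Committee skeleton (illustration only, not KQBC):
    # draw two hypotheses consistent with the labels seen so far and query
    # a new point only when the two hypotheses disagree on it.
    labeled = []                               # (x, y) pairs acquired so far
    for x in stream:
        h1 = sample_hypothesis(labeled)        # e.g. a random consistent hypothesis
        h2 = sample_hypothesis(labeled)
        if h1(x) != h2(x):                     # committee disagreement -> informative point
            labeled.append((x, label_oracle(x)))
    return labeled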
Nearest Neighbor Based Feature Selection for Regression and its Application to Neural Activity
Navot, Amir, Shpigelman, Lavi, Tishby, Naftali, Vaadia, Eilon
We present a nonlinear, simple, yet effective, feature subset selection method for regression and use it in analyzing cortical neural activity. Our algorithm involves a feature-weighted version of the k-nearest-neighbor algorithm. It is able to capture complex dependency of the target function on its input and makes use of the leave-one-out error as a natural regularization. We explain the characteristics of our algorithm on synthetic problems and use it in the context of predicting hand velocity from spikes recorded in motor cortex of a behaving monkey. By applying feature selection we are able to improve prediction quality and suggest a novel way of exploring neural data.
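As an illustration of the ingredients the abstract mentions (feature weights, k-nearest-neighbor regression, leave-one-out error), here is a minimal sketch of the leave-one-out criterion as a function of a weight vector w; the actual procedure for optimizing w is described in the paper and is not reproduced here.

import numpy as np

def loo_error_weighted_knn(X, y, w, k=5):
    # Leave-one-out squared error of a feature-weighted k-nearest-neighbor
    # regressor; a feature-selection scheme in this spirit would tune w and
    # keep the features with the largest weights.
    Xw = X * w                                 # rescale each feature by its weight
    n = len(X)
    err = 0.0
    for i in range(n):
        d = np.linalg.norm(Xw - Xw[i], axis=1)
        d[i] = np.inf                          # exclude the held-out point itself
        neighbors = np.argsort(d)[:k]
        err += (y[i] - y[neighbors].mean()) ** 2
    return err / n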
Euclidean Embedding of Co-Occurrence Data
Globerson, Amir, Chechik, Gal, Pereira, Fernando, Tishby, Naftali
Embedding algorithms search for low dimensional structure in complex data, but most algorithms only handle objects of a single type for which pairwise distances are specified. This paper describes a method for embedding objects of different types, such as images and text, into a single common Euclidean space based on their co-occurrence statistics. The joint distributions are modeled as exponentials of Euclidean distances in the low-dimensional embedding space, which links the problem to convex optimization over positive semidefinite matrices.
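A minimal sketch of the modeling idea: with embeddings phi for one object type and psi for the other, joint probabilities are taken proportional to exponentials of negative squared Euclidean distances. Normalizing over all pairs and omitting marginal terms are simplifying assumptions for illustration; the paper's model family and its optimization are richer.

import numpy as np

def embedding_log_likelihood(counts, phi, psi):
    # counts[x, y]: observed co-occurrence counts; phi, psi: low-dimensional
    # embeddings of the two object types.  Simplified model for illustration:
    # p(x, y) proportional to exp(-||phi_x - psi_y||^2), normalized over all pairs.
    d2 = ((phi[:, None, :] - psi[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    m = (-d2).max()
    log_z = m + np.log(np.exp(-d2 - m).sum())                 # log partition function
    return float((counts * (-d2 - log_z)).sum())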