Performance Analysis
Robust Bloom Filters for Large MultiLabel Classification Tasks
Moustapha M. Cisse, Nicolas Usunier, Thierry Artières, Patrick Gallinari
This paper presents an approach to multilabel classification (MLC) with a large number of labels. Our approach is a reduction to binary classification in which label sets are represented by low dimensional binary vectors. This representation follows the principle of Bloom filters, a space-efficient data structure originally designed for approximate membership testing. We show that a naive application of Bloom filters in MLC is not robust to individual binary classifiers' errors. We then present an approach that exploits a specific feature of real-world datasets when the number of labels is large: many labels (almost) never appear together. Our approach is provably robust, has sublinear training and inference complexity with respect to the number of labels, and compares favorably to state-of-the-art algorithms on two large scale multilabel datasets.
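As an illustration of the encoding the abstract refers to, the following is a minimal sketch of a standard Bloom filter applied to label sets. It is not the authors' robust variant. The function names, the SHA-256 based hashing, and the sizes (64 bits, 3 hash functions, 1000 labels) are assumptions chosen for the example. In the reduction described above, each bit of the code would be predicted by one binary classifier. The sketch shows why the naive decoding is fragile: a single bit wrongly predicted as 0 removes a true label.

```python
import hashlib


def bit_positions(label, num_bits, num_hashes):
    """Return the bit indices that `label` switches on (hypothetical hashing scheme)."""
    positions = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{label}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def encode(label_set, num_bits, num_hashes):
    """Encode a set of labels as a binary vector of length `num_bits`."""
    bits = [0] * num_bits
    for label in label_set:
        for pos in bit_positions(label, num_bits, num_hashes):
            bits[pos] = 1
    return bits


def decode(bits, all_labels, num_hashes):
    """Naive decoding: a label is predicted only if all of its bits are set."""
    num_bits = len(bits)
    return {
        label
        for label in all_labels
        if all(bits[p] for p in bit_positions(label, num_bits, num_hashes))
    }


all_labels = [f"label_{i}" for i in range(1000)]
true_labels = {"label_3", "label_42"}

code = encode(true_labels, num_bits=64, num_hashes=3)
decoded = decode(code, all_labels, num_hashes=3)
# With an error-free code, every true label is recovered
# (possibly together with some false positives).

# Simulate one binary classifier wrongly predicting 0 for a bit of "label_3".
noisy = list(code)
noisy[bit_positions("label_3", 64, 3)[0]] = 0
decoded_noisy = decode(noisy, all_labels, num_hashes=3)
# "label_3" is no longer decoded: one bit error is enough to lose a true label.
```

Note that this plain decoder scans every label, so it is linear in the number of labels; the sketch only illustrates the encoding and its sensitivity to bit errors.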
Export Reviews, Discussions, Author Feedback and Meta-Reviews
NIPS (Neural Information Processing Systems), 8-11 December 2014, Montreal, Canada
Paper ID: 1807
Title: Zero-shot recognition with unreliable attributes
Current Reviews
First provide a summary of the paper, and then address the following criteria: quality, clarity, originality and significance.
The paper strives to bridge the gap between the theory and practice of attribute-based zero-shot learning. The theory is that novel classes can be recognized automatically using pre-trained attribute predictors; in practice, however, learning these attribute classifiers can be as difficult as, or even more difficult than, learning the object classes themselves. Random forests are trained to predict unseen classes from attribute vectors, and the training procedure takes into account the reliability of the attribute detectors by propagating a validation set through each decision tree at training time. The authors also show how the method can be extended to the case where a few training examples of the test categories are available.
For [23], we refer to the paper you pointed out.
5. To Reviewer_25: "the overall conceptual motivation for the paper is somewhat weak..." The Nystrom approximation can be used to approximate the kernel matrix and speed up kernel machines, but Table 1 shows that its performance is suboptimal even when rank=200 (see the fifth column). In this case, each prediction requires 200 inner product computations, which is too slow for many real-time systems (e.g., web applications, robotic applications). The state-of-the-art Nystrom method is therefore not good enough; we reduce the prediction cost to 10 to 20 inner products while achieving better classification accuracy, which is a big improvement. Also, as mentioned in point 1 above, although we aim to optimize prediction time, our method still trains quickly. We agree that the pseudo landmark point technique could potentially be applied to speed up training, and this is an interesting research direction.
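To make the prediction-cost argument concrete, here is a minimal sketch of a generic Nystrom-approximated kernel ridge model, assuming NumPy. It is not the method proposed in the paper. The RBF kernel, the synthetic data, the ridge penalty `lam`, and the rank `m = 20` are all assumptions chosen for illustration. The point is that the learned model folds into one coefficient per landmark, so a single prediction needs as many kernel evaluations as the rank.

```python
import numpy as np


def rbf(a, b, gamma=0.5):
    """RBF kernel matrix between the rows of `a` and the rows of `b`."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)


rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))          # synthetic inputs (assumption)
y = np.sign(X[:, 0] * X[:, 1])         # synthetic +/-1 targets (assumption)

m = 20                                 # rank = number of landmark points
landmarks = X[rng.choice(len(X), size=m, replace=False)]

C = rbf(X, landmarks)                  # n x m block of the kernel matrix
W = rbf(landmarks, landmarks)          # m x m landmark kernel matrix

# Nystrom feature map: Phi = C W^{-1/2}, so that Phi Phi^T approximates K.
evals, evecs = np.linalg.eigh(W)
evals = np.maximum(evals, 1e-10)       # guard against tiny eigenvalues
W_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
Phi = C @ W_inv_sqrt

# Ridge regression on the Nystrom features.
lam = 1e-3
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(m), Phi.T @ y)

# Fold the feature map into one coefficient per landmark.
beta = W_inv_sqrt @ w


def predict(x):
    """Score one test point: m kernel evaluations, then one dot product."""
    k = rbf(x[None, :], landmarks)[0]
    return float(k @ beta)
```

With `m = 200`, as in the rank discussed above, `predict` would evaluate the kernel against 200 landmarks per test point; with 10 to 20 landmarks the per-prediction cost drops accordingly.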