 Inductive Learning


Structure Regularization for Structured Prediction

Neural Information Processing Systems

While there are many studies on weight regularization, studies on structure regularization are rare. Many existing structured prediction systems focus on increasing the level of structural dependencies within the model. However, this trend may be misdirected, because our study suggests that complex structures are actually harmful to generalization ability in structured prediction. To control structure-based overfitting, we propose a structure regularization framework via structure decomposition, which decomposes training samples into mini-samples with simpler structures, yielding a model with better generalization power. We show both theoretically and empirically that structure regularization can effectively control overfitting risk and lead to better accuracy. As a by-product, the proposed method can also substantially accelerate training. The method and the theoretical results apply to general graphical models with arbitrary structures. Experiments on well-known tasks demonstrate that our method outperforms the benchmark systems on those highly competitive tasks, achieving record-breaking accuracies with substantially faster training.
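As a rough illustration of structure decomposition, the sketch below splits one sequence-labeling sample into contiguous mini-samples of near-equal length. The abstract does not say how samples are split, so the contiguous split, the function name `decompose`, and the parameter `num_parts` are assumptions made for illustration only.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def decompose(sample: Sequence[T], num_parts: int) -> List[List[T]]:
    """Split one sample into contiguous mini-samples of near-equal length.

    Hypothetical helper: the contiguous, near-equal split is an assumption,
    not a detail taken from the abstract.
    """
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    n = len(sample)
    # Never create more parts than there are positions in the sample.
    num_parts = min(num_parts, n) or 1
    base, extra = divmod(n, num_parts)
    parts: List[List[T]] = []
    start = 0
    for k in range(num_parts):
        # The first `extra` parts get one additional element.
        size = base + (1 if k < extra else 0)
        parts.append(list(sample[start:start + size]))
        start += size
    return parts


# Usage: a tagged sentence as (word, tag) pairs, split into 3 mini-samples.
sentence = [("the", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
            ("the", "DT"), ("mat", "NN"), (".", ".")]
mini_samples = decompose(sentence, 3)
```

Each mini-sample would then be treated as an independent training sample, so dependencies that cross a split point are dropped during training.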


Object Localization based on Structural SVM using Privileged Information

Neural Information Processing Systems

We propose a structured prediction algorithm for object localization based on Support Vector Machines (SVMs) using privileged information. Privileged information provides useful high-level knowledge for image understanding and facilitates learning a reliable model even with a small number of training examples. In our setting, we assume that such information is available only at training time since it may be difficult to obtain from visual data accurately without human supervision. Our goal is to improve performance by incorporating privileged information into an ordinary learning framework and adjusting model parameters for better generalization. We tackle the object localization problem with a novel structural SVM using privileged information, where an alternating loss-augmented inference procedure is employed to handle the term in the objective function corresponding to privileged information. We apply the proposed algorithm to the Caltech-UCSD Birds 200-2011 dataset, and obtain encouraging results suggesting further investigation into the benefit of privileged information in structured prediction.


Top Rank Optimization in Linear Time

Neural Information Processing Systems

Bipartite ranking aims to learn a real-valued ranking function that orders positive instances before negative instances. Recent efforts in bipartite ranking focus on optimizing ranking accuracy at the top of the ranked list. Most existing approaches either optimize task-specific metrics or extend the rank loss by placing more emphasis on the error associated with the top-ranked instances, leading to a high computational cost that is super-linear in the number of training instances. We propose a highly efficient approach, titled TopPush, for optimizing accuracy at the top that has computational complexity linear in the number of training instances. We present a novel analysis that bounds the generalization error for the top-ranked instances for the proposed approach. Empirical study shows that the proposed approach is highly competitive with the state-of-the-art approaches and is 10-100 times faster.
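To make the linear-cost claim concrete, here is a minimal sketch of one way a top-focused ranking loss can be evaluated in time linear in the number of instances: each positive is compared only against the single highest-scoring negative, rather than against every negative. The abstract does not give the loss, so this formulation, the truncated quadratic surrogate, and the name `top_push_loss` are assumptions for illustration, not a statement of the authors' exact method.

```python
from typing import Sequence


def top_push_loss(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Average surrogate loss of positives against the top-scoring negative.

    Assumed form: mean over positives of max(0, 1 + top_neg - s_pos) ** 2.
    Cost is O(m + n) for m positives and n negatives, versus O(m * n) for a
    loss that enumerates every positive-negative pair.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    top_neg = max(neg_scores)  # single pass over the negatives
    total = 0.0
    for s in pos_scores:  # single pass over the positives
        margin = 1.0 + top_neg - s
        total += max(0.0, margin) ** 2
    return total / len(pos_scores)


# Usage: the positive scored 2.0 clears the top negative (1.0) by the margin;
# the positive scored 0.5 does not and contributes 1.5 ** 2 = 2.25.
loss = top_push_loss([2.0, 0.5], [1.0, -1.0])
```

A positive contributes nothing once it outscores every negative by the margin, which is what focuses the objective on the top of the ranked list.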



Learning with Fredholm Kernels

Neural Information Processing Systems

In this paper we propose a framework for supervised and semi-supervised learning based on reformulating the learning problem as a regularized Fredholm integral equation. Our approach fits naturally into the kernel framework and can be interpreted as constructing new data-dependent kernels, which we call Fredholm kernels. We proceed to discuss the "noise assumption" for semi-supervised learning and provide both theoretical and experimental evidence that Fredholm kernels can effectively utilize unlabeled data under the noise assumption. We demonstrate that methods based on Fredholm learning show very competitive performance in the standard semi-supervised learning setting.


Zero-shot recognition with unreliable attributes

Neural Information Processing Systems

In principle, zero-shot learning makes it possible to train a recognition model simply by specifying the category's attributes. For example, with classifiers for generic attributes like striped and four-legged, one can construct a classifier for the zebra category by enumerating which properties it possesses--even without providing zebra training images. In practice, however, the standard zero-shot paradigm suffers because attribute predictions in novel images are hard to get right. We propose a novel random forest approach to train zero-shot models that explicitly accounts for the unreliability of attribute predictions. By leveraging statistics about each attribute's error tendencies, our method obtains more robust discriminative models for the unseen classes. We further devise extensions to handle the few-shot scenario and unreliable attribute descriptions. On three datasets, we demonstrate the benefit for visual category learning with zero or few training examples, a critical domain for rare categories or categories defined on the fly.
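The zebra example can be sketched as follows. This illustrates only the standard zero-shot paradigm described in the first sentences, not the proposed random forest method: each unseen class is scored by how well the predicted attribute probabilities match its attribute signature, under an assumed independence between attributes. The names `class_score` and `predict_class`, the signatures, and the probability values are all hypothetical.

```python
from typing import Dict


def class_score(attr_probs: Dict[str, float], signature: Dict[str, bool]) -> float:
    """Score a class as the product of per-attribute agreement probabilities.

    attr_probs maps an attribute name to the predicted probability that the
    attribute is present in the image; signature says whether the class has
    that attribute. Attributes are assumed independent (a simplification).
    """
    score = 1.0
    for attr, present in signature.items():
        p = attr_probs[attr]
        score *= p if present else (1.0 - p)
    return score


def predict_class(attr_probs: Dict[str, float],
                  signatures: Dict[str, Dict[str, bool]]) -> str:
    """Return the class whose attribute signature best matches the predictions."""
    return max(signatures, key=lambda c: class_score(attr_probs, signatures[c]))


# Usage: no zebra training images are needed, only the attribute signature.
signatures = {
    "zebra": {"striped": True, "four_legged": True},
    "horse": {"striped": False, "four_legged": True},
}
attr_probs = {"striped": 0.9, "four_legged": 0.8}  # made-up classifier outputs
predicted = predict_class(attr_probs, signatures)
```

Because the score multiplies raw attribute predictions, a single unreliable attribute classifier can flip the decision, which is the weakness the abstract says the proposed method is designed to address.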


Weakly-supervised Discovery of Visual Pattern Configurations

Neural Information Processing Systems

The prominence of weakly labeled data gives rise to a growing demand for object detection methods that can cope with minimal supervision. We propose an approach that automatically identifies discriminative configurations of visual patterns that are characteristic of a given object class. We formulate the problem as a constrained submodular optimization problem and demonstrate the benefits of the discovered configurations in remedying mislocalizations and finding informative positive and negative training examples.


Review for NeurIPS paper: Graph Random Neural Networks for Semi-Supervised Learning on Graphs

Neural Information Processing Systems

Weaknesses: The proposed methods are not that novel. More specifically: (1) It seems that consistency regularization is a general framework that can be combined with other data augmentation methods, such as DropEdge, and with sampling algorithms. It would be better if the authors also tried these combinations instead of adopting only their proposed DropNode augmentation. It would also be better if the authors provided a curve showing the performance of the proposed framework against other baselines under different percentages of training data. It would further help to combine these methods with more advanced base GNNs.



Review for NeurIPS paper: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning

Neural Information Processing Systems

Weaknesses: - As mentioned in the paper, the proposed method has a trivial solution in which both models output 0's. To me, the method is too simple to be true. I tried to reimplement it, but without success. It is highly recommended to open-source the code for reproducible research. How can you learn detection with a frozen representation? Please use the standard settings, e.g. as in MoCo.