Generalized Zero-Shot Learning with Deep Calibration Network

Neural Information Processing Systems

A technical challenge of deep learning is recognizing target classes for which no training data have been seen. Zero-shot learning leverages semantic representations such as attributes or class prototypes to bridge source and target classes. Existing standard zero-shot learning methods may be prone to overfitting the seen data of source classes, as they are blind to the semantic representations of target classes. In this paper, we study generalized zero-shot learning, which assumes that the semantic representations of target classes are accessible during training and makes predictions on unseen data by searching over both source and target classes. We propose a novel Deep Calibration Network (DCN) approach for this generalized zero-shot learning paradigm, which enables simultaneous calibration of deep networks on the confidence of source classes and the uncertainty of target classes. Our approach maps visual features of images and semantic representations of class prototypes to a common embedding space such that the compatibility of seen data with both source and target classes is maximized. We show superior accuracy of our approach over the state of the art on benchmark datasets for generalized zero-shot learning, including AwA, CUB, SUN, and aPY.
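The calibration idea described above can be sketched as a simple objective: score each class by the compatibility of the image embedding with its prototype, then combine a confidence term (cross-entropy on the true source class) with an uncertainty term (entropy over target classes). This is a minimal illustrative sketch, not the authors' code; the function names, the dot-product compatibility, and the weighting `lam` are all assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def dcn_style_loss(visual_emb, prototypes, source_idx, target_idx, label, lam=0.5):
    """Illustrative DCN-style objective: compatibility is the dot product
    between an image embedding and each class prototype in the shared space.
    The loss adds confidence on the true source class (cross-entropy) to
    uncertainty over target classes (entropy of their softmax)."""
    scores = prototypes @ visual_emb               # one compatibility score per class
    p_src = softmax(scores[source_idx])            # distribution over source classes
    ce = -np.log(p_src[label])                     # confidence calibration term
    p_tgt = softmax(scores[target_idx])            # distribution over target classes
    entropy = -(p_tgt * np.log(p_tgt)).sum()       # uncertainty calibration term
    return ce + lam * entropy
```

Encouraging high entropy over target classes keeps the network from assigning all probability mass to source classes, which is the overfitting failure mode the abstract describes.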




Supplementary for Dual Progressive Prototype Network for Generalized Zero-Shot Learning

Neural Information Processing Systems

Since some recent methods use post-processing, such as calibration stacking [5] or a domain detector [2, 12], to alleviate the domain shift problem, we report the results of our Dual Progressive Prototype Network (DPPN) with post-processing in Table 3 of the main paper for fair comparison. In this part, we further compare DPPN with recent methods that clearly report their results without post-processing; the comparison is shown in Table 1. APN [15] only reports results with calibration stacking. Our DPPN outperforms the best competitor by 15.3%, 8.8%, and 7.3% in H on the CUB, AWA2, and aPY datasets, respectively, and obtains comparable performance on the SUN dataset. This demonstrates the effectiveness of learning representations that progressively explore category discrimination and attribute-region correspondence.


Dual Adversarial Semantics-Consistent Network for Generalized Zero-Shot Learning

Neural Information Processing Systems

Generalized zero-shot learning (GZSL) is a challenging class of vision and knowledge transfer problems in which both seen and unseen classes appear during testing. Existing GZSL approaches either suffer from semantic loss and discard discriminative information at the embedding stage, or cannot guarantee effective visual-semantic interactions. To address these limitations, we propose a Dual Adversarial Semantics-Consistent Network (referred to as DASCN), which learns both primal and dual Generative Adversarial Networks (GANs) in a unified framework for GZSL. In DASCN, the primal GAN learns to synthesize inter-class discriminative and semantics-preserving visual features from both the semantic representations of seen/unseen classes and the ones reconstructed by the dual GAN. The dual GAN enforces the synthetic visual features to represent prior semantic knowledge well via semantics-consistent adversarial learning. To the best of our knowledge, this is the first work that employs a dual-GAN mechanism for GZSL. Extensive experiments show that our approach achieves significant improvements over state-of-the-art approaches.
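The semantics-consistency constraint at the heart of the dual-GAN design can be illustrated with a toy cycle: the primal generator maps class semantics to visual features, the dual generator maps them back, and the reconstruction error is penalized. The linear "generators" and the loss below are illustrative assumptions standing in for DASCN's actual networks and adversarial objectives.

```python
import numpy as np

rng = np.random.default_rng(0)
d_sem, d_vis = 5, 8

# Toy linear stand-ins for the primal and dual generators
# (W_primal, W_dual, and this loss are assumptions, not DASCN's code).
W_primal = rng.normal(size=(d_vis, d_sem))   # semantics -> synthetic visual features
W_dual = rng.normal(size=(d_sem, d_vis))     # visual features -> reconstructed semantics

def semantics_consistency_loss(attr):
    """Cycle class semantics through both generators and penalize the
    reconstruction error, enforcing that synthetic visual features
    preserve prior semantic knowledge."""
    fake_visual = W_primal @ attr        # primal GAN: synthesize features
    recon_attr = W_dual @ fake_visual    # dual GAN: reconstruct semantics
    return np.sum((attr - recon_attr) ** 2)
```

In the full method this reconstruction term is trained jointly with the adversarial losses of both GANs, so that discriminativeness and semantic consistency are optimized together.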


Dual Progressive Prototype Network for Generalized Zero-Shot Learning

Neural Information Processing Systems

Generalized Zero-Shot Learning (GZSL) aims to recognize new categories with auxiliary semantic information, e.g., category attributes. In this paper, we address the critical domain shift problem, i.e., confusion between seen and unseen categories, by progressively improving the cross-domain transferability and category discriminability of visual representations. Our approach, named Dual Progressive Prototype Network (DPPN), constructs two types of prototypes that record prototypical visual patterns for attributes and categories, respectively. With attribute prototypes, DPPN alternately searches attribute-related local regions and updates the corresponding attribute prototypes to progressively explore accurate attribute-region correspondence. This enables DPPN to produce visual representations with accurate attribute localization ability, which benefits semantic-visual alignment and representation transferability. Besides, along with progressive attribute localization, DPPN further projects category prototypes into multiple spaces to progressively repel visual representations of different categories, which boosts category discriminability. Both attribute and category prototypes are collaboratively learned in a unified framework, which makes the visual representations of DPPN transferable and distinctive. Experiments on four benchmarks prove that DPPN effectively alleviates the domain shift problem in GZSL.
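The alternating search-and-update step for attribute prototypes can be sketched as attention followed by a momentum update: each attribute prototype attends to the local regions it matches best, pools an attribute-specific feature, and is nudged toward it. This is a minimal sketch in the spirit of DPPN; the variable names, dot-product attention, and momentum update are assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def update_attribute_prototypes(regions, prototypes, momentum=0.9):
    """One progressive step: attend from each attribute prototype to the
    local region features, pool an attended feature per attribute, and
    move each prototype toward its attended feature.

    regions:    (R, d) local region features from the backbone
    prototypes: (A, d) current attribute prototypes
    """
    attn = softmax(prototypes @ regions.T, axis=1)   # (A, R) attribute-region attention
    attr_feats = attn @ regions                      # (A, d) attended features
    return momentum * prototypes + (1 - momentum) * attr_feats
```

Iterating this step is what "progressively" refers to: sharper attention yields better-localized attribute features, which in turn refine the prototypes.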



Supplementary for Dual Progressive Prototype Network for Generalized Zero-Shot Learning

Neural Information Processing Systems

Notably, the performance of DPPN on CZSL is not as impressive as in GZSL. The best result is bolded. From the results, our DPPN outperforms the best previous method by 3.8%, 6.7%, and 2.9%, respectively. We adopt a two-step training schedule that first trains DPPN with the fixed ResNet-101 backbone and then fine-tunes the whole network. The best result is bolded. Since each representation derives from the preceding one, the preceding representations bring only limited supplement to the final performance.