unseen class
Beyond the Seen: Bounded Distribution Estimation for Open-Vocabulary Learning
Open-vocabulary learning requires modeling the data distribution in open environments, which consists of both seen-class and unseen-class data. Existing methods estimate the distribution in open environments using seen-class data, where the absence of unseen classes makes the estimation error inherently unidentifiable. Intuitively, learning beyond the seen classes is crucial for distribution estimation to bound the estimation error. We theoretically demonstrate that the distribution can be effectively estimated by generating unseen-class data, through which the estimation error is upper-bounded. Building on this theoretical insight, we propose a novel open-vocabulary learning method, which generates unseen-class data for estimating the distribution in open environments. The method consists of a class-domain-wise data generation pipeline and a distribution alignment algorithm. The data generation pipeline generates unseen-class data under the guidance of a hierarchical semantic tree and domain information inferred from the seen-class data, facilitating accurate distribution estimation. With the generated data, the distribution alignment algorithm estimates and maximizes the posterior probability to enhance generalization in open-vocabulary learning. Extensive experiments on 11datasets demonstrate that our method outperforms baseline approaches by up to 14%, highlighting its effectiveness and superiority.
OVSMeets Continual Learning: Towards Sustainable Open-Vocabulary Segmentation
Open-Vocabulary Segmentation (OVS) aims to segment classes that are not present in the training dataset. However, most existing studies assume that the training data is fixed in advance, overlooking more practical scenarios where new datasets are continuously collected over time. To address this, we first analyze how existing OVS models perform under such conditions. In this context, we explore several approaches such as retraining, fine-tuning, and continual learning but find that each of them has clear limitations. To address these issues, we propose ConOVS, a novel continual learning method based on a Mixture-of-Experts framework. ConOVS dynamically combines expert decoders based on the probability that an input sample belongs to the distribution of each incremental dataset. Through extensive experiments, we show that ConOVS consistently outperforms existing methods across pre-training, incremental, and zero-shot test datasets, effectively expanding the recognition capabilities of OVS models when data is collected sequentially.
How Classifier Features Transfer to Downstream: An Asymptotic Analysis in a Two-Layer Model
Neural networks learn effective feature representations, which can be transferred to new tasks without additional training. While larger datasets are known to improve feature transfer, the theoretical conditions for the success of such transfer remain unclear. This work investigates feature transfer in networks trained for classification to identify the conditions that enable effective clustering in unseen classes. We first reveal that higher similarity between training and unseen distributions leads to improved Cohesion and Separability. We then show that feature expressiveness is enhanced when inputs are similar to the training classes, while the features of irrelevant inputs remain indistinguishable.
0266e33d3f546cb5436a10798e657d97-AuthorFeedback.pdf
We thank the reviewers for their encouraging and constructive comments. We are pleased that they find the paper well1 written and acknowledge the novelty and originality of the proposed task, which "has a potential to spark interest"2 (R1) and "may lead to future papers studying it" (R2). Regarding the proposed framework, R1 and R2 not only find it3 "sound" and "novel" but also stress the "re-implementation ease" from which "practitioners may benefit" (R1). Still,4 the reviewers raise points of improvement (R1, R3) and suggest a discussion about a related task (R2). We carefully5 address these comments below.
TPR: Topology-Preserving Reservoirs for Generalized Zero-Shot Learning
Pre-trained vision-language models (VLMs) such as CLIP have shown excellent performance for zero-shot classification. Based on CLIP, recent methods design various learnable prompts to evaluate the zero-shot generalization capability on a base-to-novel setting. This setting assumes test samples are already divided into either base or novel classes, limiting its application to realistic scenarios. In this paper, we focus on a more challenging and practical setting: generalized zero-shot learning (GZSL), i.e., testing with no information about the base/novel division. To address this challenging zero-shot problem, we introduce two unique designs that enable us to classify an image without the need of knowing whether it comes from seen or unseen classes.
Robust Semi-Supervised Learning when Not All Classes have Labels
Semi-supervised learning (SSL) provides a powerful framework for leveraging unlabeled data. Existing SSL typically requires all classes have labels. However, in many real-world applications, there may exist some classes that are difficult to label or newly occurred classes that cannot be labeled in time, resulting in there are unseen classes in unlabeled data. Unseen classes will be misclassified as seen classes, causing poor classification performance. The performance of seen classes is also harmed by the existence of unseen classes.
OwMatch: Conditional Self-Labeling with Consistency for Open-World Semi-Supervised Learning
Semi-supervised learning (SSL) offers a robust framework for harnessing the potential of unannotated data. Traditionally, SSL mandates that all classes possess labeled instances. However, the emergence of open-world SSL (OwSSL) introduces a more practical challenge, wherein unlabeled data may encompass samples from unseen classes. This scenario leads to misclassification of unseen classes as known ones, consequently undermining classification accuracy. To overcome this challenge, this study revisits two methodologies from self-supervised and semi-supervised learning, self-labeling and consistency, tailoring them to address the OwSSL problem. Specifically, we propose an effective framework called, combining conditional self-labeling and open-world hierarchical thresholding. Theoretically, we analyze the estimation of class distribution on unlabeled data through rigorous statistical analysis, thus demonstrating that OwMatch can ensure the unbiasedness of the label assignment estimator with reliability. Comprehensive empirical analyses demonstrate that our method yields substantial performance enhancements across both known and unknown classes in comparison to previous studies. Code is available at https://github.com/niusj03/OwMatch .