AITopics | cd-fsod

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection

Neural Information Processing Systems, Jun-14-2026, 05:36:44 GMT

Cross-Domain Few-Shot Object Detection (CD-FSOD) aims to detect novel objects with only a handful of labeled samples from previously unseen domains. While data augmentation and generative methods have shown promise in few-shot learning, their effectiveness for CD-FSOD remains unclear due to the need for both visual realism and domain alignment. Existing strategies, such as copy-paste augmentation and text-to-image generation, often fail to preserve the correct object category or produce backgrounds coherent with the target domain, making them non-trivial to apply directly to CD-FSOD. To address these challenges, we propose Domain-RAG, a training-free, retrieval-guided compositional image generation framework tailored for CD-FSOD. Domain-RAG consists of three stages: domain-aware background retrieval, domain-guided background generation, and foreground-background composition. Specifically, the input image is first decomposed into foreground and background regions.
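The abstract names the stages of Domain-RAG but gives no implementation detail, so the following is only a minimal sketch of two of them (background retrieval and foreground-background composition), with the generation stage omitted. Everything here is an assumption made for illustration: the embedding bank, the cosine-similarity ranking, the binary-mask paste, and the names `retrieve_background` and `compose` are hypothetical and not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_background(query_emb, bank):
    """Return the key of the bank entry most similar to the query embedding.

    Assumption: backgrounds are ranked by cosine similarity of embeddings.
    """
    return max(bank, key=lambda name: cosine(query_emb, bank[name]))


def compose(foreground, mask, background):
    """Paste foreground pixels where mask is 1; keep background elsewhere.

    Assumption: a hard binary mask stands in for the composition stage.
    """
    return [
        [f if m else b for f, m, b in zip(f_row, m_row, b_row)]
        for f_row, m_row, b_row in zip(foreground, mask, background)
    ]


# Toy data: two hypothetical background embeddings and a 2x2 "image".
bank = {"underwater": [0.9, 0.1], "aerial": [0.1, 0.9]}
best = retrieve_background([0.8, 0.2], bank)
image = compose([[5, 5], [5, 5]], [[1, 0], [0, 1]], [[0, 0], [0, 0]])
```

In this toy run the query embedding is closest to the hypothetical "underwater" entry, and the composed image keeps foreground values only where the mask is set.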

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.84)

CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion

Meng, Boyuan, Zhang, Xiaohan, Li, Peilin, Wu, Zhe, Li, Yiming, Zhao, Wenkai, Yu, Beinan, Shen, Hui-Liang

arXiv.org Artificial Intelligence, May-5-2025

The object-background confusion refers to the confusion between expected objects and background. As illustrated in Figure 1(a), in underwater scenes, the boundaries between the target object and the background are often ambiguous, leading to missed detections. The object-object confusion refers to the confusion between different classes of objects. As illustrated in Figure 1(b), the similarity between different classes results in false detections. In the field of CD-FSOD, CD-ViTO [8] represents the state-of-the-art work, which devises various fine-tuning modules and achieves significant performance improvements. To address object-background confusion, CD-ViTO re-weights manually selected background features and combines them with object features in a weighted sum. However, manually designed features lack adaptability when the target domain distribution differs [4], [24]. To address object-object confusion, CD-ViTO [8] enhances class distinction by directly adjusting the support class features.
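The weighted-sum idea attributed to CD-ViTO above can be illustrated with a small sketch. This is not CD-ViTO's actual module: the normalization of the background weights, the mixing coefficient `alpha`, and the function name `fuse_features` are assumptions introduced only to show what "re-weight background features, then combine them with object features in a weighted sum" could look like.

```python
def fuse_features(object_feat, background_feats, bg_weights, alpha):
    """Combine an object feature with re-weighted background features.

    Assumptions (not from the source): background features are pooled by a
    normalized weighted average, then mixed with the object feature using a
    single scalar alpha in [0, 1].
    """
    total = sum(bg_weights)
    dim = len(object_feat)
    # Weighted average of the background features, one value per dimension.
    pooled = [
        sum(w * feat[i] for w, feat in zip(bg_weights, background_feats)) / total
        for i in range(dim)
    ]
    # Weighted sum of the object feature and the pooled background feature.
    return [alpha * o + (1 - alpha) * p for o, p in zip(object_feat, pooled)]


fused = fuse_features(
    object_feat=[1.0, 0.0],
    background_feats=[[0.0, 1.0], [0.0, 3.0]],
    bg_weights=[1.0, 1.0],
    alpha=0.5,
)
```

Because the weights here are fixed by hand, the sketch also shows the limitation the passage points out: nothing in it adapts when the target domain distribution changes.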

artificial intelligence, confusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.00938

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)

NTIRE 2025 Challenge on Cross-Domain Few-Shot Object Detection: Methods and Results

Fu, Yuqian, Qiu, Xingyu, Ren, Bin, Fu, Yanwei, Timofte, Radu, Sebe, Nicu, Yang, Ming-Hsuan, Van Gool, Luc, Zhang, Kaijin, Nong, Qingpeng, Dong, Xiugang, Gao, Hong, Zhou, Xiangsheng, Pan, Jiancheng, Liu, Yanxing, He, Xiao, Li, Jiahao, Sun, Yuze, Huang, Xiaomeng, Zhang, Zhenyu, Ma, Ran, Liu, Yuhan, Zhuang, Zijian, Yi, Shuai, Zou, Yixiong, Hong, Lingyi, Chen, Mingxi, Li, Runze, Sheng, Xingdong, Zhang, Wenqiang, Chen, Weisen, Yan, Yongxin, Chen, Xinguo, Shao, Yuanjie, Zuo, Zhengrong, Sang, Nong, Wu, Hao, Sun, Haoran, Hu, Shuming, Zhang, Yan, Shi, Zhiguang, Zhang, Yu, Chen, Chao, Wang, Tao, Feng, Da, Zhuo, Linhai, Lin, Ziming, Huang, Yali, Me, Jie, Yang, Yiming, Guo, Mi, Jiu, Mingyuan, Xu, Mingliang, Xiong, Maomao, Zhang, Qunshu, Cao, Xinyu, Yang, Yuqing, Sheng, Dianmo, Zhao, Xuanpu, Li, Zhiyu, Ding, Xuyang, Li, Wenqian

arXiv.org Artificial Intelligence, Apr-16-2025

Cross-Domain Few-Shot Object Detection (CD-FSOD) poses significant challenges to existing object detection and few-shot detection models when applied across domains. In conjunction with NTIRE 2025, we organized the 1st CD-FSOD Challenge, aiming to advance the performance of current object detectors on entirely novel target domains with only limited labeled data. The challenge attracted 152 registered participants, received submissions from 42 teams, and concluded with 13 teams making valid final submissions. Participants approached the task from diverse perspectives, proposing novel models that achieved new state-of-the-art (SOTA) results under both open-source and closed-source settings. In this report, we present an overview of the 1st NTIRE 2025 CD-FSOD Challenge, highlighting the proposed solutions and summarizing the results submitted by the participants.

artificial intelligence, computer vision and pattern recognition, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2504.10685

Country:

Europe (1.00)
Asia > China (0.67)
North America > United States (0.46)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.66)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

Fu, Yuqian, Wang, Yu, Pan, Yixuan, Huai, Lian, Qiu, Xingyu, Shangguan, Zeyu, Liu, Tong, Kong, Lingjie, Fu, Yanwei, Van Gool, Luc, Jiang, Xingqun

arXiv.org Artificial Intelligence, Feb-5-2024

This paper addresses the challenge of cross-domain few-shot object detection (CD-FSOD), aiming to develop an accurate object detector for novel domains with minimal labeled examples. Transformer-based open-set detectors such as DE-ViT have excelled in both open-vocabulary object detection and traditional few-shot object detection, detecting categories beyond those seen during training. This naturally raises two key questions: 1) Can such open-set detection methods easily generalize to CD-FSOD? 2) If not, how can the results of open-set methods be enhanced when faced with significant domain gaps? To address the first question, we introduce several metrics to quantify domain variances and establish a new CD-FSOD benchmark with diverse domain metric values. Several state-of-the-art (SOTA) open-set object detection methods are evaluated on this benchmark, with evident performance degradation observed across out-of-domain datasets. This indicates that open-set detectors fail when adopted directly for CD-FSOD. Subsequently, to overcome the performance degradation and to answer the second question, we enhance the vanilla DE-ViT. With several novel components, including finetuning, a learnable prototype module, and a lightweight attention module, we present an improved Cross-Domain Vision Transformer for CD-FSOD (CD-ViTO). Experiments show that CD-ViTO achieves impressive results on both out-of-domain and in-domain target datasets, establishing new SOTAs for both CD-FSOD and FSOD. All datasets, code, and models will be released to the community.

cd-fsod, dataset, module, (15 more...)

arXiv.org Artificial Intelligence

2402.03094

Country: Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)
