Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection
Zhang, Shan, Murray, Naila, Wang, Lei, Koniusz, Piotr
–arXiv.org Artificial Intelligence
In this paper, we tackle the challenging problem of Few-shot Object Detection. Existing FSOD pipelines (i) use average-pooled representations that result in information loss; and/or (ii) discard position information that can help detect object instances. Consequently, such pipelines are sensitive to large intra-class appearance and geometric variations between support and query images. To address these drawbacks, we propose a Time-rEversed diffusioN tEnsor Transformer (TENET), which i) forms high-order tensor representations that capture multi-way feature occurrences that are highly discriminative, and ii) uses a transformer that dynamically extracts correlations between the query image and the entire support set, instead of a single average-pooled support embedding. We also propose a Transformer Relation Head (TRH), equipped with higher-order representations, which encodes correlations between query regions and the entire support set, while being sensitive to the positional variability of object instances. Our model achieves state-of-the-art results on PASCAL VOC, FSOD, and COCO.
arXiv.org Artificial Intelligence
Oct-30-2022
- Country:
- Asia > South Korea
- Europe
- North America
- Canada > Quebec
- Montreal (0.04)
- United States
- California > Los Angeles County
- Long Beach (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Massachusetts > Suffolk County
- Boston (0.04)
- Nevada > Clark County
- Las Vegas (0.04)
- Utah > Salt Lake County
- Salt Lake City (0.04)
- California > Los Angeles County
- Canada > Quebec
- South America > Chile
- Genre:
- Research Report (0.40)
- Technology: