[Figure: (a) Visual Domain View (RGB); (b) Spectral Domain View (MSI)]
–Neural Information Processing Systems
Drone-based multi-object tracking is essential yet highly challenging due to small targets, severe occlusions, and cluttered backgrounds. Existing RGB-based multi-object tracking algorithms depend heavily on spatial appearance cues such as color and texture, which often degrade in aerial views and compromise tracking reliability. Multispectral imagery captures pixel-level spectral reflectance and thus provides spectral cues that significantly enhance object discriminability under degraded spatial conditions. However, the lack of dedicated multispectral UAV datasets has hindered progress in this domain. To bridge this gap, we introduce MMOT, the first challenging benchmark dataset for drone-based multispectral multi-object tracking. It features three key characteristics: (i) Large Scale: 125 video sequences with over 488.8K annotations across eight object categories; (ii) Comprehensive Challenges: diverse real-world difficulties such as extremely small targets, high-density scenarios, severe occlusions, and complex platform motion; and (iii) Precise Oriented Annotations: oriented boxes that enable accurate localization and reduce object ambiguity under aerial perspectives.
Jun-22-2026, 22:02:06 GMT
- Country:
  - Asia (0.28)
- Genre:
  - Research Report
  - Experimental Study (1.00)
  - New Finding (0.92)
  - Research Report
- Industry:
  - Transportation (0.92)
  - Information Technology > Security & Privacy (0.46)
- Technology: