Beyond BEV: Optimizing Point-Level Tokens for Collaborative Perception
Li, Yang, Yuan, Quan, Luo, Guiyang, Fu, Xiaoyuan, Pan, Rui, Yang, Yujia, Shao, Congzhang, Liu, Yuewen, Li, Jinglin
arXiv.org Artificial Intelligence
Collaborative perception allows agents to enhance their perceptual capabilities by exchanging intermediate features. Existing methods typically organize these intermediate features as 2D bird's-eye-view (BEV) representations, which discard critical fine-grained 3D structural cues essential for accurate object recognition and localization. To this end, we first introduce point-level tokens as intermediate representations for collaborative perception. However, point-cloud data are inherently unordered, massive, and position-sensitive, making it challenging to produce compact and aligned point-level token sequences that preserve detailed structural information. Therefore, we present CoPLOT, a novel Collaborative perception framework that utilizes Point-Level Optimized Tokens. It incorporates a point-native processing pipeline, including token reordering, sequence modeling, and multi-agent spatial alignment. A semantic-aware token reordering module generates adaptive 1D reorderings by leveraging scene-level and token-level semantic information. A frequency-enhanced state space model captures long-range sequence dependencies across both spatial and spectral domains, improving the differentiation between foreground tokens and background clutter. Lastly, a neighbor-to-ego alignment module applies a closed-loop process, combining global agent-level correction with local token-level refinement to mitigate localization noise. Extensive experiments on both simulated and real-world datasets show that CoPLOT outperforms state-of-the-art models, with even lower communication and computation overhead. Code will be available at https://github.com/CheeryLeeyy/CoPLOT.
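To make the alignment problem in the abstract concrete, the sketch below maps point-level token coordinates from a neighbor's frame into the ego frame with a plain planar rigid transform, then measures how far a slightly wrong relative pose shifts them. It is only an illustration of why localization noise matters, not CoPLOT's neighbor-to-ego alignment module. The function names, the `(tx, ty, yaw)` pose format, and all numeric values are assumptions made for this example.

```python
import math


def neighbor_to_ego(points, pose):
    """Map (x, y, z) token coordinates from a neighbor's frame into the ego frame.

    pose = (tx, ty, yaw) is a hypothetical relative pose of the neighbor
    expressed in the ego frame; only a planar rotation about z is modeled.
    """
    tx, ty, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x - s * y + tx, s * x + c * y + ty, z) for x, y, z in points]


def mean_offset(a, b):
    """Average planar distance between corresponding points of two sets."""
    total = sum(
        math.hypot(ax - bx, ay - by) for (ax, ay, _), (bx, by, _) in zip(a, b)
    )
    return total / len(a)


# Illustrative token coordinates in the neighbor's frame (made-up values).
tokens = [(1.0, 0.0, 0.5), (0.0, 2.0, 1.0)]

# Assumed true relative pose and a noisy estimate of it.
true_pose = (10.0, -5.0, math.pi / 2)
noisy_pose = (10.3, -5.2, math.pi / 2 + 0.02)

aligned = neighbor_to_ego(tokens, true_pose)
misaligned = neighbor_to_ego(tokens, noisy_pose)

# Residual displacement caused purely by the pose error.
error = mean_offset(aligned, misaligned)
print(f"mean planar offset from pose noise: {error:.3f} m")
```

In this toy setup the pose error displaces every token, which is the kind of residual that an agent-level correction followed by a token-level refinement would need to remove.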
Aug-28-2025
- Country:
- Asia > China
- Europe > Austria
- Vienna (0.14)
- North America
- Canada > British Columbia
- Vancouver (0.04)
- United States
- California > San Diego County
- San Diego (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Tennessee > Davidson County
- Nashville (0.04)
- Washington > King County
- Seattle (0.04)
- Oceania
- Australia > Victoria
- Melbourne (0.04)
- New Zealand > North Island
- Auckland Region > Auckland (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Information Technology (0.68)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks
- Deep Learning (0.46)
- Natural Language (1.00)
- Representation & Reasoning > Agents (1.00)
- Vision (1.00)