SWA-SOP: Spatially-aware Window Attention for Semantic Occupancy Prediction in Autonomous Driving

Cao, Helin; Materla, Rafael; Behnke, Sven

arXiv.org Artificial Intelligence 

Perception systems in autonomous driving rely on sensors such as LiDAR and cameras to perceive the 3D environment. However, due to occlusions and data sparsity, these sensors often fail to capture complete information. Existing transformer-based Semantic Occupancy Prediction (SOP) methods lack explicit modeling of spatial structure in attention computation, resulting in limited geometric awareness and poor performance in sparse or occluded areas. To this end, we propose Spatially-aware Window Attention (SWA), a novel mechanism that incorporates local spatial context into attention. SWA significantly improves scene completion and achieves state-of-the-art results on LiDAR-based SOP benchmarks. We further validate its generality by integrating SWA into a camera-based SOP pipeline, where it also yields consistent gains across modalities.