Occlusion-Aware Diffusion Model for Pedestrian Intention Prediction

Liu, Yu, Liu, Zhijie, Yang, Zedong, Li, You-Fu, Kong, He

Nov-4-2025–arXiv.org Artificial Intelligence

Abstract: Predicting pedestrian crossing intentions is crucial for the navigation of mobile robots and intelligent vehicles. Although recent deep learning-based models have shown significant success in forecasting intentions, few consider incomplete observation under occlusion scenarios. To tackle this challenge, we propose an Occlusion-Aware Diffusion Model (ODM) that reconstructs occluded motion patterns and leverages them to guide future intention prediction. During the denoising stage, we introduce an occlusion-aware diffusion transformer architecture to estimate noise features associated with occluded patterns, thereby enhancing the model's ability to capture contextual relationships in occluded semantic scenarios. Furthermore, an occlusion mask-guided reverse process is introduced to effectively utilize observation information, reducing the accumulation of prediction errors and enhancing the accuracy of reconstructed motion features. The performance of the proposed method under various occlusion scenarios is comprehensively evaluated and compared with existing methods on popular benchmarks, namely PIE and JAAD. Extensive experimental results demonstrate that the proposed method achieves more robust performance than existing methods in the literature.

With the rapid advancement of intelligent sensing and computing technologies, much progress has been made in recent years in developing autonomous vehicles to enhance traffic efficiency and road safety. To prevent collisions, path planning of autonomous vehicles [1], [2] is essential, requiring an understanding of interactions between road users and the ability to forecast their potential actions [3]-[5].

This manuscript has been accepted to the IEEE Transactions on Intelligent Transportation Systems as a regular paper. Yu Liu is also with the Department of Mechanical Engineering, City University of Hong Kong, Hong Kong SAR, China. You-Fu Li is with the Department of Mechanical Engineering, City University of Hong Kong, Hong Kong SAR, China.

The typical scenario of visual occlusion is illustrated here. Solid green lines represent the parts of the observation that are within the field of view and visible, while dashed red lines indicate positional features that are undetectable due to occlusion.
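
The occlusion mask-guided reverse process mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify ODM's architecture or sampling rule, so the code below assumes a standard DDPM-style reverse step combined with an inpainting-style replacement, in which visible frames are re-injected from the observation at every step and only occluded frames are taken from the denoiser. The names `make_schedule`, `dummy_noise_predictor`, and `masked_reverse_process`, the linear noise schedule, and the toy trajectory are all hypothetical.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def dummy_noise_predictor(x_t, t):
    """Hypothetical stand-in for a trained denoising network."""
    return np.zeros_like(x_t)


def masked_reverse_process(observed, visible_mask, noise_predictor,
                           num_steps=50, seed=0):
    """Reverse diffusion that keeps visible frames tied to the observation.

    observed:     (T, D) motion features; occluded rows may hold any value.
    visible_mask: (T, 1) array, 1.0 where the frame is visible, 0.0 if occluded.
    """
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal(observed.shape)  # start from pure noise

    for t in range(num_steps - 1, -1, -1):
        # DDPM posterior mean computed from the predicted noise.
        eps_hat = noise_predictor(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat)
        mean = mean / np.sqrt(alphas[t])

        if t > 0:
            # Occluded part: sample from the model's reverse step.
            x_unknown = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
            # Visible part: forward-noise the observation to level t - 1.
            noise = rng.standard_normal(x.shape)
            x_known = (np.sqrt(alpha_bars[t - 1]) * observed
                       + np.sqrt(1.0 - alpha_bars[t - 1]) * noise)
        else:
            x_unknown = mean
            x_known = observed

        # The mask decides which source each frame is taken from.
        x = visible_mask * x_known + (1.0 - visible_mask) * x_unknown
    return x


# Toy example: 8 frames of 2-D positions, frames 3 to 5 occluded.
observed = np.stack([np.linspace(0.0, 1.0, 8), np.linspace(0.5, 0.7, 8)], axis=1)
visible_mask = np.ones((8, 1))
visible_mask[3:6] = 0.0
reconstructed = masked_reverse_process(observed, visible_mask, dummy_noise_predictor)
```

In this sketch the visible frames of `reconstructed` match `observed` exactly, because the final step copies them from the observation. The occluded frames are not meaningful here, since the placeholder predictor always returns zero; a trained noise predictor would be needed for them to carry information.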

artificial intelligence, machine learning, prediction

Country:
- Asia > China > Hong Kong (0.85)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Transportation
  - Ground > Road (1.00)
  - Infrastructure & Services (0.87)

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
