SKGE-SWIN: End-To-End Autonomous Vehicle Waypoint Prediction and Navigation Using Skip Stage Swin Transformer

Fachri Najm Noer Kartiman, Rasim, Yaya Wihardi, Nurul Hasanah, Oskar Natan, Bambang Wahono, Taufik Ibnu Salim

arXiv.org Artificial Intelligence 

Abstract--This research proposes SKGE-Swin, an end-to-end autonomous vehicle model with pixel-to-pixel context awareness. The architecture uses the Swin Transformer with a skip-stage mechanism to broaden feature representation globally and at various network levels. The Swin Transformer's Shifted Window-based Multi-head Self-Attention (SW-MSA) lets the model extract information from distant pixels, while the skip-stage connections retain critical information from the initial to the final stages of feature extraction, enhancing the model's ability to comprehend complex patterns in the vehicle's surroundings. The model is evaluated on the CARLA platform using adversarial scenarios to simulate real-world conditions. Experimental results show that SKGE-Swin achieves a higher Driving Score than previous methods. An ablation study is also conducted to evaluate the contribution of each architectural component, including the skip connections and the Swin Transformer, to model performance.

Index Terms--multitask learning, autonomous driving, end-to-end learning, skip connections, Swin Transformer, self-attention mechanism.

I. Introduction

Autonomous driving is a complex intelligent system that handles tasks ranging from perception to vehicle control, necessitating distinct modules [1]. The conventional integration of these modules, however, is often intricate and inefficient.

Fachri Najm Noer Kartiman is with Department of Computer Science, Indonesia University of Education, Bandung 40154, Indonesia (e-mail: fachri-najmnoer@upi.edu). Rasim is with Department of Computer Science, Indonesia University of Education, Bandung 40154, Indonesia (e-mail: rasim@upi.edu). Yaya Wihardi is with Department of Computer Science, Indonesia University of Education, Bandung 40154, Indonesia (e-mail: yayawihardi@upi.edu).
Nurul Hasanah is with Research Center for Smart Mechatronics, National Research and Innovation Agency, Bandung 40135, Indonesia (e-mail: nuru030@brin.go.id). Bambang Wahono is with Research Center for Smart Mechatronics, National Research and Innovation Agency, Bandung 40135, Indonesia (e-mail: bamb047@brin.go.id). Taufik Ibnu Salim is with Research Center for Smart Mechatronics, National Research and Innovation Agency, Bandung 40135, Indonesia (e-mail: tauf021@brin.go.id).
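
To illustrate how the shifted-window attention (SW-MSA) named in the abstract can connect distant pixels, the following is a minimal sketch of the window partitioning step only, written in plain Python. It is not the SKGE-Swin implementation: the 4x4 token grid, the window size of 2, and the helper names `cyclic_shift` and `window_partition` are assumptions chosen for illustration, and the attention computation and the attention mask that a Swin Transformer applies to wrapped-around tokens are omitted.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and to the left by `shift` cells, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    blocks = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            blocks.append([grid[r][c]
                           for r in range(r0, r0 + window)
                           for c in range(c0, c0 + window)])
    return blocks


# Hypothetical 4x4 grid of token indices (illustrative sizes only).
H = W = 4
WINDOW = 2
tokens = [[r * W + c for c in range(W)] for r in range(H)]

# Regular windows: attention would be restricted to each block.
regular = window_partition(tokens, WINDOW)

# Shifted windows: shift by half a window, then partition again.
shifted = window_partition(cyclic_shift(tokens, WINDOW // 2), WINDOW)

print("regular:", regular)
print("shifted:", shifted)
```

In this sketch the first shifted window groups tokens 5, 6, 9, and 10, each of which sits in a different regular window, so alternating regular and shifted layers lets information flow across window boundaries.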