ETA: Efficiency through Thinking Ahead, A Dual Approach to Self-Driving with Large Models

Hamdan, Shadi, Sima, Chonghao, Yang, Zetong, Li, Hongyang, Güney, Fatma

Jun-10-2025–arXiv.org Artificial Intelligence

How can we benefit from large models without sacrificing inference speed, a common dilemma in self-driving systems? A prevalent solution is a dual-system architecture, employing a small model for rapid, reactive decisions and a larger model for slower but more informative analyses. Existing dual-system designs often implement parallel architectures where inference is either directly conducted using the large model at each current frame or retrieved from previously stored inference results. However, these works still struggle to enable large models for a timely response to every online frame. Our key insight is to shift intensive computations of the current frame to previous time steps and perform a batch inference of multiple time steps to make large models respond promptly to each time step. T o achieve the shifting, we introduce Efficiency through Thinking Ahead (ETA), an asynchronous system designed to: (1) propagate informative features from the past to the current frame using future predictions from the large model, (2) extract current frame features using a small model for real-time responsiveness, and (3) integrate these dual features via an action mask mechanism that emphasizes action-critical image regions.

large language model, machine learning, small model, (19 more...)

arXiv.org Artificial Intelligence

Jun-10-2025

arXiv.org PDF

Add feedback

Country:
- Asia > China (0.28)

Genre:
- Research Report (0.82)

Industry:
- Transportation > Ground > Road (0.48)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Robots > Autonomous Vehicles (0.87)
  - Natural Language > Large Language Model (0.70)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found