Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability

Neural Information Processing Systems 

World models can foresee the outcomes of different actions, a capability that is critical for autonomous driving. Nevertheless, existing driving world models remain limited in three respects: generalization to unseen environments, prediction fidelity for critical details, and action controllability for flexible application.