Building reliable sim driving agents by scaling self-play