MinimaxOptimalOnlineImitationLearningvia ReplayEstimation
–Neural Information Processing Systems
In the tabular setting or with linear function approximation, our meta theorem shows that the performance gap incurred by ourapproachachievestheoptimal eO min(H3/2/Nexp,H/ p Nexp dependency, undersignificantly weakerassumptions compared topriorwork.
Neural Information Processing Systems
Feb-8-2026, 03:37:36 GMT
- Country:
- Europe > Italy
- Sardinia (0.04)
- North America > United States
- Illinois > Cook County
- Chicago (0.04)
- Pennsylvania > Allegheny County
- Pittsburgh (0.04)
- Illinois > Cook County
- Europe > Italy
- Technology: