TOM: Learning Policy-Aware Models for Model-Based Reinforcement Learning via Transition Occupancy Matching

Open in new window