PWM: Policy Learning with Large World Models