Reinforcement Learning from Delayed Observations via World Models