Figure panels: Diffusion; StateSpaceDiffuser (Ours)
Neural Information Processing Systems
World models have recently gained prominence for action-conditioned visual prediction in complex environments. However, because they rely on only a few recent observations, they lose long-term context. Consequently, within a few steps, the generated scenes drift from what was previously observed, undermining temporal coherence. This limitation, common in state-of-the-art diffusion-based world models, stems from the lack of a persistent environment state. To address this problem, we introduce StateSpaceDiffuser, which enables a diffusion model to perform long-context tasks by integrating features from a state-space model that represents the entire interaction history.
Jun-17-2026, 20:49:00 GMT
- Country:
- Asia (0.28)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.67)
- Research Report
- Industry:
- Information Technology (1.00)
- Leisure & Entertainment > Games
- Computer Games (0.92)
- Technology: