Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals
Neural Information Processing Systems
Recent advances in video generation models have sparked interest in world models capable of simulating realistic environments. While navigation has been well explored, physically meaningful interactions that mimic real-world forces remain largely understudied. In this work, we investigate using physical forces as a control signal for video generation and propose force prompts, which enable users to interact with images through both localized point forces, such as poking a plant, and global wind force fields, such as wind blowing on fabric. We demonstrate that these force prompts can enable videos to respond realistically to physical control signals by leveraging the visual and motion prior in the original pretrained model, without using any 3D asset or physics simulator at inference. The primary challenge of force prompting is the difficulty of obtaining high-quality paired force-video training data, both in the real world, where force signals are hard to measure, and in synthetic data, where physics simulators are limited in visual quality and domain diversity.
Jun-19-2026, 23:01:32 GMT
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Industry:
- Information Technology (0.67)
- Leisure & Entertainment > Games
- Computer Games (0.46)
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Representation & Reasoning (1.00)
- Machine Learning > Neural Networks (1.00)
- Natural Language > Large Language Model (0.68)