Instant Video Models: Universal Adapters for Stabilizing Image-Based Networks
–Neural Information Processing Systems
When applied sequentially to video, frame-based networks often exhibit temporal inconsistency, for example outputs that flicker between frames. This problem is amplified when the network inputs contain time-varying corruptions. In this work, we introduce a general approach for adapting frame-based models for stable and robust inference on video. We describe a class of stability adapters that can be inserted into virtually any architecture, along with a resource-efficient training process that can be performed with a frozen base network. We introduce a unified conceptual framework for describing temporal stability and corruption robustness, centered on a proposed accuracy-stability-robustness loss. By analyzing the theoretical properties of this loss, we identify the conditions under which it produces well-behaved stabilizer training.
Jun-20-2026, 22:22:12 GMT
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.67)
- Industry:
- Information Technology (0.93)
- Technology: