Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists
Neural Information Processing Systems
Video content editing has a wide range of applications. With the advancement of diffusion-based generative models, video editing techniques have made remarkable progress, yet they remain far from practical usability. Existing inversion-based video editing methods are time-consuming and struggle to maintain consistency in unedited regions. Although instruction-based methods have high theoretical potential, they face significant challenges in constructing high-quality training datasets: current datasets suffer from problems with editing correctness, frame consistency, and sample diversity. To bridge these gaps, we introduce the Señorita-2M dataset, a large-scale, diverse, and high-quality video editing dataset.
Jun-15-2026, 02:32:23 GMT
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (0.93)
- Industry:
  - Leisure & Entertainment (1.00)
  - Media > Photography (0.47)
- Technology:
  - Information Technology > Artificial Intelligence
    - Vision (1.00)
    - Natural Language (1.00)
    - Machine Learning > Neural Networks (0.93)