Prompt-based Consistent Video Colorization

Dani, Silvia, Uricchio, Tiberio, Seidenari, Lorenzo

Dec-1-2025–arXiv.org Artificial Intelligence

Existing video colorization methods struggle with temporal flickering or demand extensive manual input. We propose a novel approach automating high-fidelity video colorization using rich semantic guidance derived from language and segmentation. We employ a language-conditioned diffusion model to colorize grayscale frames. Guidance is provided via automatically generated object masks and textual prompts; our primary automatic method uses a generic prompt, achieving state-of-the-art results without specific color input. Temporal stability is achieved by warping color information from previous frames using optical flow (RAFT); a correction step detects and fixes inconsistencies introduced by warping. Evaluations on standard benchmarks (DAVIS30, VIDEVO20) show our method achieves state-of-the-art performance in colorization accuracy (PSNR) and visual realism (Colorfulness, CDC), demonstrating the efficacy of automated prompt-based guidance for consistent video colorization.

colorization, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

Dec-1-2025

arXiv.org PDF

Add feedback

Country:
- Europe > Italy (0.14)

Genre:
- Research Report > Promising Solution (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning (1.00)
  - Vision (0.94)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found