From Image to Video: An Empirical Study of Diffusion Representations