Video Instruction Tuning With Synthetic Data