Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset