Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models