When language and vision meet road safety: leveraging multimodal large language models for video-based traffic accident analysis