Fine-Tuning Large Multimodal Models for Automatic Pronunciation Assessment

Open in new window