Audio-Aware Large Language Models as Judges for Speaking Styles
