A Simple Baseline for Audio-Visual Scene-Aware Dialog
