Rethinking Audio-visual Synchronization for Active Speaker Detection
