SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model
