Schrödinger Bridge Mamba for One-Step Speech Enhancement
Yang, Jing; Wang, Sirui; Wu, Chao; Fan, Fan
arXiv.org Artificial Intelligence
ABSTRACT
We propose Schrödinger Bridge Mamba (SBM), a new training-inference framework motivated by the inherent compatibility between the Schrödinger Bridge (SB) training paradigm and the selective state-space model Mamba. Experiments on a joint denoising and dereverberation task using four benchmark datasets demonstrate that SBM, with only 1-step inference, outperforms strong baselines with 1-step or iterative inference and achieves the best real-time factor (RTF). Beyond speech enhancement, we discuss the integration of the SB paradigm and selective state-space model architectures based on their underlying alignment, which indicates a promising direction for exploring new deep generative models potentially applicable to a broad range of generative tasks.

Index Terms: Schrödinger Bridge, Mamba, Deep generative model, Speech enhancement

1. INTRODUCTION
Deep generative models have been increasingly employed for speech enhancement (SE) tasks. By learning the underlying distribution of clean audio given its degraded counterpart, generative models can produce high-quality speech from low-quality inputs that contain noise, reverberation, clipping, bandwidth limitation, or a mixture of these artifacts.
Oct-21-2025