Speech Modelling Using Subspace and EM Techniques

Smith, Gavin, Freitas, João F. G. de, Robinson, Tony, Niranjan, Mahesan

Neural Information Processing Systems 

The speech waveform can be modelled as a piecewise-stationary linear stochastic state space system, and its parameters can be estimated using an expectation-maximisation (EM) algorithm. One problem is the initialisation of the EM algorithm. Standard initialisation schemes can lead to poor formant trajectories, yet these trajectories are important for vowel intelligibility. The aim of this paper is to investigate the suitability of subspace identification methods to initialise EM. The paper compares the subspace state space system identification (4SID) method with the EM algorithm. The 4SID and EM methods are similar in that they both estimate a state sequence (using Kalman filters and Kalman smoothers respectively), and then estimate parameters (using least-squares and maximum likelihood respectively).
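
The two-stage structure described above (estimate a state sequence, then estimate parameters from it) can be illustrated with a minimal sketch. This is not the paper's 4SID or EM procedure: it assumes a scalar model x[t+1] = a*x[t] + w[t], y[t] = c*x[t] + v[t], runs the Kalman filter with a working guess of the parameters (here simply the values used for simulation), and refits only the transition coefficient by least squares. The function names `simulate`, `kalman_filter` and `least_squares_a`, and all numerical values, are hypothetical choices made for illustration.

```python
import random


def simulate(a, c, q, r, n, seed=0):
    """Simulate observations from the assumed scalar model
    x[t+1] = a*x[t] + w[t],  y[t] = c*x[t] + v[t],
    with w ~ N(0, q) and v ~ N(0, r)."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for _ in range(n):
        ys.append(c * x + rng.gauss(0.0, r ** 0.5))
        x = a * x + rng.gauss(0.0, q ** 0.5)
    return ys


def kalman_filter(ys, a, c, q, r, m0=0.0, p0=1.0):
    """Return the filtered state means E[x[t] | y[0..t]]."""
    m, p, means = m0, p0, []
    for y in ys:
        # Measurement update: correct the prediction with the new sample.
        k = p * c / (c * c * p + r)
        m = m + k * (y - c * m)
        p = (1.0 - k * c) * p
        means.append(m)
        # Time update: propagate mean and variance one step ahead.
        m = a * m
        p = a * a * p + q
    return means


def least_squares_a(states):
    """Least-squares fit of a in states[t+1] ~ a * states[t]."""
    num = sum(x1 * x0 for x0, x1 in zip(states, states[1:]))
    den = sum(x0 * x0 for x0 in states[:-1])
    return num / den


if __name__ == "__main__":
    ys = simulate(a=0.9, c=1.0, q=1.0, r=0.1, n=2000, seed=1)
    means = kalman_filter(ys, a=0.9, c=1.0, q=1.0, r=0.1)
    print("least-squares estimate of a:", least_squares_a(means))
```

Replacing the filter with a smoother and the least-squares step with a maximum-likelihood update would give the corresponding EM-style iteration; the sketch omits both to stay short.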
