Giuliani, Diego
Connectionist Speaker Normalization with Generalized Resource Allocating Networks
Furlanello, Cesare, Giuliani, Diego, Trentin, Edmondo
The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speakerdependent, HMM-based) to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically rich sentences). Recognition error of phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and online optimization of parameters (GRAN model). For this application, the model included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations.
Connectionist Speaker Normalization with Generalized Resource Allocating Networks
Furlanello, Cesare, Giuliani, Diego, Trentin, Edmondo
The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speakerdependent, HMM-based)to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically richsentences). Recognition error of phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and online optimization of parameters (GRAN model). For this application, themodel included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations.