Speech conversion is a new technology that changes the features of a source speaker's speech into those of a target speaker. In this paper, the Chinese speech conversion system is divided into three parts. In the first and second parts, a GMM (Gaussian Mixture Model) is used to transform the spectral envelope, represented by LPC (Linear Predictive Coding) coefficients, and the excitation (residual), respectively. In the third part, the suprasegmental features of Chinese speech are adjusted with SVR (Support Vector Regression) and TD-PSOLA (Time-Domain Pitch-Synchronous Overlap-Add). The algorithm is capable of transforming Chinese speech and producing spontaneous-sounding voice.
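As a rough illustration of the decomposition that the first two parts rely on, the sketch below splits one speech frame into LPC coefficients (the spectral envelope) and a residual (the excitation), then rebuilds the frame from the two. It is not the paper's implementation: the autocorrelation method with the Levinson-Durbin recursion, the function names, the model order and the synthetic frame are all assumptions made for the example, and the GMM mapping, SVR and TD-PSOLA stages are not shown.

```python
import math


def autocorr(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (samples outside it count as zero)."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns a = [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lpc_residual(frame, a):
    """Inverse filtering: e[t] = sum_j a[j] * x[t - j]."""
    p = len(a) - 1
    return [sum(a[j] * frame[t - j] for j in range(p + 1) if t - j >= 0)
            for t in range(len(frame))]


def lpc_synthesize(residual, a):
    """All-pole synthesis: x[t] = e[t] - sum_{j>=1} a[j] * x[t - j]."""
    p = len(a) - 1
    out = []
    for t, e in enumerate(residual):
        past = sum(a[j] * out[t - j] for j in range(1, p + 1) if t - j >= 0)
        out.append(e - past)
    return out


# Hypothetical test frame: a decaying mixture of two sinusoids, not real speech.
frame = [math.exp(-0.01 * t) * (math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t))
         for t in range(160)]
order = 8  # assumed model order, chosen only for the example
coeffs = levinson_durbin(autocorr(frame, order), order)
residual = lpc_residual(frame, coeffs)
rebuilt = lpc_synthesize(residual, coeffs)
```

In a conversion system of the kind described, the coefficients and the residual would each be mapped toward the target speaker before synthesis; here they are passed through unchanged, so `rebuilt` should match `frame` up to rounding error.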
Audio Engineering