Based on the pulse source-system model, a convenient and effective method for evaluating the transfer function of the human pulse system is proposed, using the principles of signal detection and system analysis. The experimental results show that the pulse system has 3 formants for the normal pulse, 2 for the smooth pulse, 4 for the wiry pulse, and only 1 for the thready pulse. Formant frequencies reflect the resonance behaviour of the arterial system.
The identification and classification of pathological voices remain a challenging area of research in speech processing. Acoustic features of speech are used mainly to discriminate normal voices from pathological voices. This paper explores and compares several classification models to assess how well acoustic parameters differentiate normal voices from pathological voices in children. The classification is implemented using a Support Vector Machine (SVM) and a Radial Basis Functional Neural Network (RBFNN), and the normal and pathological voices of children are used to train and test the classifiers. A dataset is constructed by recording speech utterances of a set of Tamil phrases. The speech signal is then analyzed to extract acoustic parameters such as signal energy, pitch, formant frequencies, mean square residual signal, reflection coefficients, jitter, and shimmer. These acoustic features are combined into a feature set for detecting voice disorders in children, on the basis of which a pathologist can prescribe further treatment. Successful pathological voice classification would thus enable an automatic, non-invasive device for diagnosing and analyzing a patient's voice.
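As a rough illustration of two of the features named above, the sketch below computes local jitter and local shimmer using one common definition: the mean absolute difference between consecutive cycles, divided by the mean over all cycles. The abstract does not say which definition the authors used, so this is an assumption. The function name `local_perturbation` and the sample cycle values are hypothetical and are not data from the study.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean.

    Applied to glottal cycle periods this gives local jitter; applied to
    per-cycle peak amplitudes it gives local shimmer (assumed definitions).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical per-cycle measurements (illustrative only).
periods_ms = [5.0, 5.1, 4.9, 5.0]   # cycle durations in milliseconds
amplitudes = [1.0, 0.9, 1.1, 1.0]   # per-cycle peak amplitudes

jitter = local_perturbation(periods_ms)
shimmer = local_perturbation(amplitudes)
print(f"jitter = {jitter:.4f}, shimmer = {shimmer:.4f}")
```

In a pipeline like the one the abstract describes, values such as these would be concatenated with the other acoustic parameters to form the feature vector passed to the classifier.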
This paper proposes a novel voice conversion method based on frequency warping. The frequency warping function is generated by mapping the formants of the source speaker to those of the target speaker. In addition to frequency warping, fundamental frequency adjustment, spectral envelope equalization, breathiness addition, and duration modification are used to improve similarity to the target speaker. The proposed method needs only a very small amount of training data to generate the warping function, which greatly facilitates its application. Systems based on the proposed method were entered in the 2007 TC-STAR intra-lingual voice conversion evaluation for English and Spanish and in a cross-lingual voice conversion evaluation for Spanish. The evaluation results show that the proposed method achieves much better converted-speech quality than other methods, as well as a good balance between quality and similarity. The IBM1 system was ranked No. 1 in the English evaluation and No. 2 in the Spanish evaluation. The results also show that the proposed method is convenient and competitive for cross-lingual voice conversion tasks.
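One simple way to build a warping function from formant pairs is a piecewise-linear map that passes through (0, 0), each (source formant, target formant) pair, and (Nyquist, Nyquist). The sketch below shows that construction. It is not confirmed to be the method used in the paper, whose abstract gives no details of the function's form. The name `make_warp` and the formant values are hypothetical.

```python
from bisect import bisect_right


def make_warp(src_formants, tgt_formants, nyquist):
    """Return a piecewise-linear map from source to target frequency (Hz)."""
    if len(src_formants) != len(tgt_formants):
        raise ValueError("formant lists must have equal length")
    # Anchor points: origin, formant pairs, and the Nyquist frequency.
    xs = [0.0] + [float(f) for f in src_formants] + [float(nyquist)]
    ys = [0.0] + [float(f) for f in tgt_formants] + [float(nyquist)]
    for seq in (xs, ys):
        if any(b <= a for a, b in zip(seq, seq[1:])):
            raise ValueError("anchors must be strictly increasing")

    def warp(freq):
        # Clamp to the valid band, then locate the enclosing segment.
        f = min(max(float(freq), 0.0), xs[-1])
        i = max(1, min(bisect_right(xs, f), len(xs) - 1))
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        # Linear interpolation within the segment.
        return y0 + (f - x0) * (y1 - y0) / (x1 - x0)

    return warp


# Hypothetical formant frequencies in Hz (illustrative only).
warp = make_warp([500, 1500, 2500], [600, 1700, 2600], nyquist=8000)
print(warp(250), warp(500), warp(8000))
```

Because only a few formant pairs are needed to define the anchors, a map of this kind is consistent with the abstract's point that very little training data is required.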
Vocal individuality is widespread in social animals. Individual variation in vocalizations is a prerequisite for discriminating among conspecifics and may have facilitated the evolution of large complex societies. Ring-tailed lemurs (Lemur catta) live in relatively large social groups and have conspicuous vocal repertoires, and their species-specific utterances can be interpreted in light of the source-filter theory of vocal production. Indeed, their utterances allow individual discrimination and even recognition thanks to the resonance frequencies of the vocal tract. The purpose of this study is to determine which distinctive vocal features can be derived from the morphology of the upper vocal tract. To accomplish this, we built computational models derived from anatomical measurements collected on lemur cadavers and compared the results with the spectrographic output of vocalizations recorded from ex situ live individuals. Our results demonstrate that the morphological variation of the ring-tailed lemur vocal tract explains the individual distinctiveness of their species-specific utterances. We also provide further evidence that vocal tract modeling is a powerful tool for studying the vocal output of non-human primates.
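This paper uses ultrasound imaging to collect articulatory speech data for the vowels of the Amdo dialect of Tibetan, and systematically analyzes the dynamic tongue positions of these vowels, the static tongue positions in their steady-state segment, and their acoustic formant data. The experimental results show that a steady-state phase does exist during tongue movement, within which the differences between frames are small. Comparing the tongue-position features of this phase with those of Old Tibetan shows that the Amdo vowel system has undergone certain changes: ordered from low to high, the vowel tongue positions are /a/, /i, u, o/, /e/; ordered from front to back, they are /e/, /i, u, a/, /o/. The vowels /i/ and /u/ have centralized and produced new allophones. Finally, we give an overall description of the tongue gestures of Amdo vowels from a spatial-domain perspective. The study establishes that the physiological and acoustic characteristics of vowels are consistent, which has theoretical significance and reference value for research on both the articulatory differences and the commonalities among Tibetan dialects.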