Funding: This work was supported by the Regional Innovation Cooperation Project of Sichuan Province (Grant No. 2022YFQ0073).
Abstract: The Internet of Things (IoT) plays an essential role in current and future generations of information, network, and communication development and applications. This research focuses on vocal tract visualization and modeling, which are critical issues in realizing inner vocal tract animation. Such animation is applied in many fields, such as speech training, speech therapy, speech analysis, and other speech production-related applications. This work constructed a geometric model by observation of magnetic resonance imaging (MRI) data, providing a new method to annotate and construct 3D vocal tract organs. The proposed method has two advantages over previous methods. First, it has a uniform construction protocol for all speech organs. Second, it can build corresponding feature points between different speech organs. Fewer than three control parameters are enough to describe every speech organ accurately, with an accumulated contribution rate of more than 88%. After reconstruction, the model error is less than 1.0 mm. This is the first 3D vocal tract model built from Chinese MRI data. It will promote theoretical research and development of the intelligent Internet of Things for speech generation-related issues.
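As an illustration of the "accumulated contribution rate" mentioned above, the sketch below assumes the term has its usual principal component analysis meaning: the cumulative share of total variance carried by the leading components. The function names and the eigenvalues are hypothetical and are not taken from the paper.

```python
def accumulated_contribution(eigenvalues):
    """Return the cumulative variance share of components sorted by descending eigenvalue."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    rates, running = [], 0.0
    for value in ordered:
        running += value
        rates.append(running / total)
    return rates


def components_needed(eigenvalues, threshold):
    """Return the smallest number of leading components whose cumulative share reaches threshold."""
    for count, rate in enumerate(accumulated_contribution(eigenvalues), start=1):
        if rate >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical eigenvalues for one speech organ (illustrative only, not the paper's data).
example_eigenvalues = [6.0, 3.0, 0.5, 0.3, 0.2]
print(components_needed(example_eigenvalues, 0.88))
```

With these made-up values the first two components already carry 90% of the variance, which is the kind of situation the abstract describes when it says fewer than three parameters pass the 88% mark.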
Funding: Supported partly by the Promoting Science and Technology by the Japan Ministry of Education, Culture, Sports, Science and Technology and the SCOPE of the Ministry of Internal Affairs and Communications (MIC), Japan (No. 071705001).
Abstract: A three-dimensional (3-D) physiological articulatory model was developed to account for the biomechanical properties of the speech organs in speech production. Controlling the model to investigate the mechanism of speech production requires an efficient control module that estimates the muscle activation patterns used to manipulate the 3-D physiological articulatory model according to the desired articulatory posture. For this purpose, a feedforward control strategy was developed by mapping the articulatory target to the corresponding muscle activation pattern via the intrinsic representation of vowel articulation. In this process, the articulatory postures are first mapped to the corresponding intrinsic representations; then, the articulatory postures are clustered in the intrinsic representation space, and a nonlinear function is approximated for each cluster to map the intrinsic representation of vowel articulation to the muscle activation pattern by using general regression neural networks (GRNN). The results show that the feedforward control module is able to manipulate the 3-D physiological articulatory model for vowel production with high accuracy, both acoustically and articulatorily.
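A general regression neural network is, in its standard form, a Gaussian-kernel weighted average of the training targets. The sketch below shows that prediction rule for a single GRNN; the paper fits one such function per cluster, which is not reproduced here. The function name, the smoothing width `sigma`, and the toy data are assumptions for illustration, not values from the paper.

```python
import math


def grnn_predict(train_inputs, train_targets, query, sigma=0.5):
    """Predict a target vector as the Gaussian-kernel weighted mean of the training targets."""
    weights = []
    for sample in train_inputs:
        squared_distance = sum((a - b) ** 2 for a, b in zip(sample, query))
        weights.append(math.exp(-squared_distance / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training samples for this sigma")
    dims = len(train_targets[0])
    return [
        sum(w * target[d] for w, target in zip(weights, train_targets)) / total
        for d in range(dims)
    ]


# Toy data: 1-D "intrinsic representation" inputs, 2-D "muscle activation" targets (hypothetical).
inputs = [[0.0], [1.0]]
targets = [[0.0, 1.0], [1.0, 0.0]]
print(grnn_predict(inputs, targets, [0.5]))
```

A query halfway between the two training points receives equal weights, so the prediction is the mean of the two targets; a smaller `sigma` makes the output follow the nearest training sample more closely.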
Abstract: In this study, fifteen elderly participants performed the mechanical version of the three-disk Tower of London task, with changes in the movements, under concurrent articulatory suppression. The same executive task was also performed without the verbal secondary task, and the results of the two conditions were compared. The comparison provided evidence for a role of inner speech in the more complicated Tower of London tasks, although in general the results suggest a more prominent role of the inner scribe in spatial planning in this executive task. The roles of inner speech and the inner scribe in the Tower of London task are then described using the Baddeley and Logie working memory model.
Funding: Project (Nos. 61370034, 61273268, and 61005019) supported by the National Natural Science Foundation of China.
Abstract: Articulatory features describe how articulators are involved in making sounds. Speakers often pronounce accented phonemes in a more exaggerated way, so articulatory features can be helpful in pitch accent detection. Instead of using actual articulatory features obtained by direct measurement of the articulators, we use the posterior probabilities produced by multi-layer perceptrons (MLPs) as articulatory features. The inputs of the MLPs are frame-level acoustic features pre-processed using the split temporal context-2 (STC-2) approach. The outputs are the posterior probabilities of a set of articulatory attributes. These posterior probabilities are averaged piecewise within the range of each syllable and then act as syllable-level articulatory features. This work is the first to introduce articulatory features into pitch accent detection. Using the articulatory features extracted in this way, together with other traditional acoustic features, improves the accuracy of pitch accent detection by about 2%.
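The step from frame-level posteriors to syllable-level features can be sketched as follows. This is one possible reading of "averaged piecewise", assuming each syllable's frames are split into a fixed number of contiguous pieces and the posteriors are averaged within each piece; the abstract does not give the number of pieces, so `pieces`, the function name, and the sample values are hypothetical.

```python
def syllable_features(posteriors, start, end, pieces=3):
    """Average frame-level posterior vectors over equal pieces of the frames in [start, end)."""
    frames = posteriors[start:end]
    if len(frames) < pieces:
        raise ValueError("syllable has fewer frames than pieces")
    dims = len(frames[0])
    features = []
    for piece in range(pieces):
        # Integer boundaries keep every piece non-empty when len(frames) >= pieces.
        lo = piece * len(frames) // pieces
        hi = (piece + 1) * len(frames) // pieces
        segment = frames[lo:hi]
        features.extend(sum(frame[d] for frame in segment) / len(segment) for d in range(dims))
    return features


# Four frames with two articulatory attributes each (made-up posteriors).
frame_posteriors = [[0.2, 0.8], [0.4, 0.6], [1.0, 0.0], [0.0, 1.0]]
print(syllable_features(frame_posteriors, 0, 4, pieces=2))
```

The result has `pieces` times as many values as there are attributes, so syllables of different durations all map to feature vectors of the same length.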
Funding: This work is supported by the National Natural Science Foundation of China (69972046) and the NSF of Zhejiang Province (698076).
Abstract: A method to synthesize formant-targeted sounds based on a speech production model and the Reflection-Type Line Analog (RTLA) articulatory synthesis model is presented. The synthesis model is implemented with a scattering process derived from an RTLA of the vocal tract system according to the acoustic mechanism of speech production. The vocal-tract area function that controls the synthesis model is derived from the first three formant trajectories by using the inverse solution of speech production. The proposed method not only gives good naturalness and dynamic smoothness, but can also control or modify speech timbres easily and flexibly. Furthermore, it needs fewer control parameters and a very low parameter update rate.
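In reflection-type tube models of the vocal tract, the area function drives the scattering through the reflection coefficient at each junction between adjacent tube sections. The sketch below computes those coefficients under the pressure-wave convention, r = (A_k - A_{k+1}) / (A_k + A_{k+1}); the volume-velocity convention flips the sign. It is a generic illustration of that relationship rather than the paper's implementation, and the function name and areas are hypothetical.

```python
def reflection_coefficients(areas):
    """Return the pressure-wave reflection coefficient at each junction of a tube area function."""
    if any(area <= 0 for area in areas):
        raise ValueError("cross-sectional areas must be positive")
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


# Hypothetical three-section area function in cm^2, ordered from glottis to lips.
print(reflection_coefficients([1.0, 3.0, 3.0]))
```

A junction between equal areas gives a coefficient of zero (no reflection), and the magnitude grows as the area change becomes more abrupt.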