
Uyghur Homograph Disambiguation Based on Classification and Optimal Mapping Pronunciation
Abstract: This paper studies homographs in the Uyghur language and classifies them into three types according to their characteristics. Type 1 homographs are disambiguated using the mapping between part of speech and pronunciation. Type 2 homographs are disambiguated according to whether vowel weakening occurs when a suffix attaches to the stem. Type 3 homographs are disambiguated by extracting contextual information and selecting the optimal matching pronunciation. Keywords are selected with a log-likelihood ratio method, and comparative experiments are run on keyword selection with different window widths. Experimental results show that the approach achieves a homograph disambiguation error rate of 20.9%.
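The abstract does not give the exact likelihood-ratio formula, so the following is only a minimal sketch of one common variant: a smoothed absolute log-likelihood ratio over context words inside a fixed window around a two-pronunciation homograph. The function name `rank_keywords`, the smoothing constant `alpha`, the labels "A" and "B", and the toy tokens are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def rank_keywords(samples, window=3, alpha=0.5):
    """Rank context words by a smoothed absolute log-likelihood ratio.

    samples: list of (tokens, target_index, label), where label is one of
    exactly two pronunciations of the homograph at tokens[target_index].
    window: number of tokens taken on each side of the homograph.
    alpha: additive smoothing constant (an assumption, not from the paper).
    Returns a list of (word, (score, favoured_label)), highest score first.
    """
    labels = sorted({label for _, _, label in samples})
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two pronunciations")
    counts = {label: Counter() for label in labels}

    for tokens, idx, label in samples:
        left = tokens[max(0, idx - window):idx]
        right = tokens[idx + 1:idx + 1 + window]
        # Count each context word once per sample.
        counts[label].update(set(left + right))

    first, second = labels
    vocab = set(counts[first]) | set(counts[second])
    scores = {}
    for word in vocab:
        ratio = (counts[first][word] + alpha) / (counts[second][word] + alpha)
        favoured = first if ratio > 1 else second
        scores[word] = (abs(math.log(ratio)), favoured)

    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1][0], kv[0]))


if __name__ == "__main__":
    # Toy data: "HG" marks the homograph; "A" and "B" are its two readings.
    toy_samples = [
        (["fish", "HG", "river"], 1, "A"),
        (["fish", "HG", "lake"], 1, "A"),
        (["the", "HG", "river"], 1, "A"),
        (["play", "HG", "guitar"], 1, "B"),
        (["the", "HG", "guitar"], 1, "B"),
    ]
    for width in (1, 2):
        print(width, rank_keywords(toy_samples, window=width)[:3])
```

Varying `window` as in the loop above mirrors the kind of window-width comparison the abstract mentions. Words that occur equally often with both readings score zero and carry no disambiguating information.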
Source: Computer Engineering (《计算机工程》), indexed in CAS and CSCD, 2012, Issue 18, pp. 22-25 (4 pages)
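Funding: National Natural Science Foundation of China (61065005, 61062008); Program for New Century Excellent Talents in University, Ministry of Education (NCET-10-0969)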
Keywords: Uyghur language; homograph disambiguation; classification; vowel weakening; optimal mapping pronunciation; keyword selection










