Research of Uyghur N-gram Model (维吾尔语的N-gram语言模型研究)
Cited by: 1
Abstract: Data sparsity in the N-gram statistics for Uyghur degrades the recognition performance of statistical models. This study builds unigram, bigram, and trigram statistics from a corpus of government documents and reports, and applies additive, linear interpolation, Witten-Bell, and Kneser-Ney smoothing to the resulting models. In these experiments, Kneser-Ney smoothing substantially reduced the perplexity of the Uyghur N-gram model.
Author: Zhang Yajun (张亚军)
Affiliation: Changji University (昌吉学院)
Source: Computer Knowledge and Technology (《电脑知识与技术(过刊)》), 2011, No. 6X, pp. 4177-4179 (3 pages)
Keywords: language model; smoothing algorithm; perplexity; Uyghur-Chinese parallel corpus
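
To make the terms in the abstract concrete, the following is a minimal sketch of additive smoothing and perplexity for a bigram model. It is not the paper's implementation: the record only says that 1- to 3-gram statistics and four smoothing methods were used, so the function names, the `<unk>` handling, the `delta` parameter, and the toy English sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram_counts(sentences):
    """Count bigrams and their histories over tokenized sentences."""
    history_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        history_counts.update(padded[:-1])            # every token that has a successor
        bigram_counts.update(zip(padded[:-1], padded[1:]))
    # Words the model may predict: training words, end marker, unknown marker.
    vocab = {w for s in sentences for w in s} | {"</s>", "<unk>"}
    return history_counts, bigram_counts, vocab


def additive_prob(prev, word, history_counts, bigram_counts, vocab_size, delta=1.0):
    """P(word | prev) with additive (add-delta) smoothing."""
    numerator = bigram_counts[(prev, word)] + delta
    denominator = history_counts[prev] + delta * vocab_size
    return numerator / denominator


def perplexity(sentences, history_counts, bigram_counts, vocab, delta=1.0):
    """Perplexity = 2 ** (-average log2 probability per predicted token)."""
    log_prob = 0.0
    predicted = 0
    for tokens in sentences:
        # Map out-of-vocabulary words to <unk> (an assumption of this sketch).
        mapped = [w if w in vocab else "<unk>" for w in tokens]
        padded = ["<s>"] + mapped + ["</s>"]
        for prev, word in zip(padded[:-1], padded[1:]):
            p = additive_prob(prev, word, history_counts, bigram_counts,
                              len(vocab), delta)
            log_prob += math.log2(p)
            predicted += 1
    return 2 ** (-log_prob / predicted)


# Toy data (hypothetical, not from the paper's corpus).
train = [
    ["the", "report", "was", "approved"],
    ["the", "report", "was", "published"],
    ["the", "plan", "was", "approved"],
]
held_out = [["the", "plan", "was", "published"]]

h, b, v = train_bigram_counts(train)
for d in (1.0, 0.1):
    print(f"delta={d}: perplexity={perplexity(held_out, h, b, v, d):.3f}")
```

Without smoothing, any bigram unseen in training would receive probability zero and make the perplexity infinite; adding `delta` to every count avoids that at the cost of taking probability mass from observed bigrams. Witten-Bell and Kneser-Ney, which the abstract also names, redistribute that mass in a more informed way and are not shown here.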

References (5)

  • 1. Shengwei Tian, Turgun Ibrahim. Chinese-Uighur Sentence Alignment Based on Hybrid Strategy with Mistake Spread Suppression. ESIAT. 2009. Cited by: 1
  • 2. Jiang Minghu, Zhu Xiaoyan, Yuan Baozong. A domain-adaptive smoothing algorithm for Chinese N-gram language models [J]. Journal of Tsinghua University (Science and Technology), 1999, 39(9): 99-102. Cited by: 9
  • 3. Roth D, Zelenko D. Part of speech tagging using a network of linear separators. Proceedings of the Annual Meeting of the Association for Computational Linguistics and International Conference on Computational Linguistics. 1998. Cited by: 1
  • 4. Lidstone G J. Note on the general case of the Bayes-Laplace formula for inductive or a posteriori probabilities. Transactions of the Faculty of Actuaries. 1992. Cited by: 1
  • 5. Xu Zhiming, Wang Xiaolong, Guan Yi. Data smoothing techniques for N-gram language models [J]. Application Research of Computers, 1999, 16(7): 37-39. Cited by: 10

Second-level references (2)

  • 1. Chen, Stanley F. Doctoral dissertation, 1996. Cited by: 1
  • 2. Zhou M. IEICE Trans Inf Syst, 1996, vol. E79, no. 4, p. 333. Cited by: 1

Documents sharing references (15)

Co-cited documents (1)

Citing documents (1)

Second-level citing documents (4)
