基于隐最大熵原理的汉语词义消歧方法 (cited by: 8)

Chinese Word Sense Disambiguation Based on Latent Maximum Entropy Principle
Abstract: Jaynes' maximum entropy principle can use only the explicit statistical features of the context when building a language model. To address this limitation, this paper proposes building a Chinese word sense disambiguation model with the latent maximum entropy (LME) principle. After studying the relationship between words and sememes in HowNet, the authors convert the word collocation information obtained from the contexts in the training corpus into sememe collocation information, which yields a method for extracting latent semantic features of text based on sememe collocations. These latent features are combined with traditional context features, and the latent maximum entropy principle is then applied to disambiguate polysemous words in text. Experimental results show that, in the sense disambiguation of ten polysemous verbs, the proposed method improves accuracy by about 4%.
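Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2012, No. 3, pp. 72-78 (7 pages)
Funding: National Natural Science Foundation of China (60873013, 61070119); Open Project Fund of the Key Laboratory of Computational Linguistics (Peking University), Ministry of Education (KLCL-1005); Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (PHR201007131)
Keywords: latent maximum entropy principle; text latent features; sememe collocation information; word sense disambiguation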
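
The abstract describes converting word collocations from the training contexts into sememe collocations. The record does not give the actual procedure, so the following is only a minimal sketch of one plausible reading: each context word is mapped to its sememes and paired with the ambiguous target word. The lexicon `TOY_LEXICON`, its sememe labels, and the function `sememe_collocations` are hypothetical illustrations, not HowNet data and not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy word-to-sememe lexicon. The sememe labels are invented
# for illustration and are NOT taken from HowNet.
TOY_LEXICON = {
    "篮球": ["sport", "tool"],
    "比赛": ["sport", "compete"],
    "电话": ["communicate", "tool"],
}


def sememe_collocations(target, context_words, lexicon):
    """Turn (target, context word) collocations into (target, sememe) counts.

    Context words that are missing from the lexicon are skipped.
    """
    counts = Counter()
    for word in context_words:
        for sememe in lexicon.get(word, []):
            counts[(target, sememe)] += 1
    return counts


if __name__ == "__main__":
    # Context of the polysemous verb "打" in a sports sentence.
    features = sememe_collocations("打", ["篮球", "比赛", "昨天"], TOY_LEXICON)
    for pair, count in sorted(features.items()):
        print(pair, count)
```

Counting at the sememe level lets different context words that share a sememe contribute to the same feature, which is the kind of generalization the abstract attributes to latent semantic features. How such features would enter the latent maximum entropy model is not specified in this record and is not shown here.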