Multi-source data fusion based on the expand vector space model
(Original title: 基于扩展特征向量空间模型的多源数据融合)

Cited by: 5
Abstract: Expanding ontology resources is one of the key problems in natural language processing. Information obtained from a single data source has low coverage and cannot reflect the overall picture, so an integrated data management platform is needed to store and organize data resources by category. To this end, the AVP data platform is proposed. The central problem in building the AVP platform is multi-source data fusion: performing semantic role labeling on web data from different sources, identifying ambiguous entries, and finally merging them into a data warehouse whose basic unit is the word sense. To address this problem, an automatic method of semantic role matching (semantic disambiguation) is presented. The basic idea is to use the attribute-value pairs of an entry as the feature template and, with the help of the co-occurrence probability of attribute values, apply an extended vector space model to identify ambiguity among entries. Extensive comparative experiments show that the system performs well in all respects and that the proposed algorithm effectively solves the semantic role matching problem in multi-source data fusion.
Authors: Chen Kerui (陈珂锐), Pan Jun (潘君)
Source: Journal of Shandong University (Natural Science) 《山东大学学报(理学版)》, 2013, No. 11, pp. 87-92 (6 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
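Funding: National Natural Science Foundation of China (61202285); National Spark Program project (2012GA750007); Key Science and Technology Research Project of the Education Department of Henan Province (12A120002); Basic and Frontier Technology Research Project of the Science and Technology Department of Henan Province (122300410378); Academic Innovation Backbone Support Program of Henan University of Economics and Law.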
Keywords: natural language processing; ontology; multi-source data fusion; semantic role matching
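The abstract describes the disambiguation idea only in words: attribute-value pairs serve as features, and co-occurrence probabilities of attribute values extend a vector space comparison. The sketch below is one possible reading of that idea, not the authors' algorithm. It assumes binary feature weights, a symmetric co-occurrence estimate, and a soft-cosine similarity as the "extension"; the function names (`cooccurrence`, `soft_cosine`, `assign_sense`), the threshold, and the toy entries are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence(senses):
    """Build a co-occurrence estimate between (attribute, value) features.

    senses: list of dicts (attribute -> value), one per sense already stored.
    The estimate joint / min(count_a, count_b) is an assumption of this sketch.
    """
    single, pair = Counter(), Counter()
    for entry in senses:
        feats = sorted(entry.items())
        single.update(feats)
        pair.update(combinations(feats, 2))  # pairs come out in sorted order

    def prob(f1, f2):
        if f1 == f2:
            return 1.0
        a, b = sorted((f1, f2))
        denom = min(single[a], single[b])
        return pair[(a, b)] / denom if denom else 0.0

    return prob


def soft_cosine(x, y, prob):
    """Cosine similarity in which related features earn partial credit."""
    fx, fy = list(x.items()), list(y.items())

    def dot(u, v):
        return sum(prob(i, j) for i in u for j in v)

    norm = sqrt(dot(fx, fx) * dot(fy, fy))
    return dot(fx, fy) / norm if norm else 0.0


def assign_sense(entry, senses, prob, threshold=0.5):
    """Return the index of the best-matching sense, or None for a new sense."""
    best, best_sim = None, threshold
    for idx, sense in enumerate(senses):
        sim = soft_cosine(entry, sense, prob)
        if sim >= best_sim:
            best, best_sim = idx, sim
    return best


# Toy warehouse: two senses of the ambiguous entry "Apple" (invented data).
senses = [
    {"type": "company", "founder": "Steve Jobs", "headquarters": "Cupertino"},
    {"type": "fruit", "family": "Rosaceae"},
]
prob = cooccurrence(senses)

# Entries arriving from other sources (also invented).
from_site_a = {"type": "company", "headquarters": "Cupertino", "industry": "electronics"}
from_site_b = {"type": "fruit", "colour": "red"}
from_site_c = {"type": "film", "director": "unknown"}

matches = [assign_sense(e, senses, prob) for e in (from_site_a, from_site_b, from_site_c)]
```

In this toy run the first two entries merge into existing senses and the third is left unmatched, which would correspond to opening a new sense in the warehouse. With so few stored senses every co-occurrence estimate is either 0 or 1; a realistic corpus would give graded values, and the threshold would need tuning against labeled data.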

