
An Integrated Technology of Method and Component for Data Cleaning (cited by 4)
Abstract: Meeting data quality requirements in distributed application systems calls for a shared environment of data cleaning methods and components. This paper proposes an integrated model of data cleaning methods and components, made up of a method model, a process model, and a component model, designed to support the retrieval and selection of components when they are used. A network mapping diagram describes how the process model and the method model combine, and instances of data cleaning methods are given in the diagram. Data cleaning components are described with a formal language, and XML Schema is used to define five facets (Header, Deployment, Form, Function, and Implementation) together with their sub-facets. Taking the data deletion task as a case study, the paper details the design process and algorithm description of a data deletion and recovery method, gives the XML Schema representation of the corresponding component, and shows two operation interfaces of its implementation. The proposed integrated technology of methods and components has been applied in actual research projects.
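The abstract names the five facets and a data deletion and recovery method but does not reproduce the schema or the algorithm. The following is only a minimal sketch of the general idea, under stated assumptions: the facet names are taken from the abstract, while the facet values, the `RecoverableTable` class, and its methods are hypothetical illustrations and not the authors' design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Facet names come from the abstract; every value is a hypothetical
# placeholder and is not taken from the paper.
COMPONENT_FACETS: Dict[str, Dict[str, str]] = {
    "Header": {"name": "DataDelete", "version": "0.1"},
    "Deployment": {"platform": "any"},
    "Form": {"kind": "function library"},
    "Function": {"task": "delete rows and allow recovery"},
    "Implementation": {"language": "Python"},
}

Row = Dict[str, object]
Batch = List[Tuple[int, Row]]  # (original index, deleted row)


@dataclass
class RecoverableTable:
    """Hypothetical in-memory table whose deletions can be undone."""

    rows: List[Row]
    # Stack of deletion batches; the most recent batch is recovered first.
    _trash: List[Batch] = field(default_factory=list)

    def delete_where(self, predicate: Callable[[Row], bool]) -> int:
        """Remove rows matching the predicate and remember where they were."""
        kept: List[Row] = []
        batch: Batch = []
        for index, row in enumerate(self.rows):
            if predicate(row):
                batch.append((index, row))
            else:
                kept.append(row)
        self.rows = kept
        if batch:
            self._trash.append(batch)
        return len(batch)

    def recover_last(self) -> int:
        """Restore the most recent deletion batch at its original positions."""
        if not self._trash:
            return 0
        batch = self._trash.pop()
        # Indices were recorded in ascending order, so inserting in that
        # order rebuilds the row order that existed before the deletion.
        for index, row in batch:
            self.rows.insert(index, row)
        return len(batch)
```

In this sketch, calling `delete_where(lambda r: r["v"] is None)` drops rows with a missing value, and a later `recover_last()` puts them back in place. Keeping deleted rows in a stack rather than discarding them is one simple way to make a cleaning step reversible; the paper's actual method may differ.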
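Authors: Zhang Xiaoming, Qiao Xi
Source: Journal of Petrochemical Universities (《石油化工高等学校学报》), CAS, 2005, No. 2, pp. 67-71 (5 pages)
Funding: General Program of the Science and Technology Development Plan of the Beijing Municipal Education Commission (KM200510017006).
Keywords: data cleaning; method; component; model; Extensible Markup Language (XML)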
References (8)

  • 1. Chang Jichuan, Li Keqin, Guo Lifeng, Mei Hong, Yang Fuqing. Representation and retrieval of reusable software components in the Jade Bird system[J]. Acta Electronica Sinica, 2000, 28(8): 20-23.
  • 2. Rahm E, Do H H. Data cleaning: problems and current approaches[J]. IEEE engineering bulletin, 2000, 23(4): 3-13.
  • 3. Mong Li Lee, Wynne H, Vijay K. Cleaning the spurious links in data[J]. IEEE intelligent systems, 2004, (Mar./Apr.): 28-33.
  • 4. Galhardas H. Declarative data cleaning: language, model and algorithms[A]. Proceedings of the 27th VLDB conference[C], Roma, Italy, 2001: 371-380.
  • 5. Lee M L, Ling T W, Low W L. IntelliClean: a knowledge-based intelligent data cleaner[A]. Proceedings of the 6th ACM SIGKDD international conference on knowledge discovery and data mining[C]. Boston: ACM press, 2000: 290-294.
  • 6. Raman V, Hellerstein J. Potter's wheel: an interactive data cleaning system[A]. Proceedings of the 27th VLDB conference[C], Roma, Italy, 2001: 381-390.
  • 7. Zhang Xiaoming, Qiao Xi. Research on knowledge management for enterprise modeling methodology[A]. Proc. of the 10th joint international computer conference[C], Kunming, China, 2004.
  • 8. Raimund D, Michael H, Klaus M. Contigra: an XML-based architecture for component-oriented 3D applications[A]. Web3D'02[C], Tempe, Arizona, USA, 2002: 24-28.
