
Mining Method for Data Quality Detection Rules (cited by: 8)
Abstract: Data quality rules are the key to detecting the quality of a database. To discover data quality rules automatically from relational databases, and to detect erroneous or abnormal data on that basis, this paper studies the form of quality rules and the measures for evaluating them. It proposes criteria for computing minimal quality rules based on data item groups and their confidence, mining algorithms for such rules, and an approach to detecting erroneous data with them. The rule form borrows the confidence evaluation mechanism of association rules and the expressive power of conditional functional dependencies, so that functional dependencies, conditional functional dependencies, and association rules are all described in one format. The resulting rules are concise, objective, and comprehensive, and they detect abnormal data accurately. Compared with related work, the proposed mining algorithms have lower time complexity, and the discovered rules improve the error detection rate. Experiments demonstrate the effectiveness and correctness of the method.
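Authors: Liu Bo (刘波), Geng Yinrong (耿寅融)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list, 2012, No. 5, pp. 835-844 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 61003056), the Natural Science Foundation of Guangdong Province (No. S2012010008831), and the Science and Technology Research Project of Guangdong Province (No. 2010B010600026)
Keywords: Data Quality Rule, Detection, Mining, Data Item Group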
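
The abstract describes rules that are evaluated by the confidence of data item groups and then used to flag erroneous tuples. The paper's actual rule form and mining algorithm are not given in this record, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a rule has the shape "attribute values on the left-hand side determine one right-hand-side value", and that a group's confidence is the share of its tuples carrying the dominant right-hand-side value. The names `mine_rules`, `detect_suspects`, `min_conf`, and `min_support`, as well as the sample data, are hypothetical.

```python
from collections import Counter, defaultdict


def mine_rules(rows, lhs, rhs, min_conf=0.75, min_support=2):
    """Group rows by their LHS values and keep groups whose dominant
    RHS value reaches the confidence and support thresholds.

    Returns {lhs_values: (rhs_value, confidence)}.
    This is an illustrative simplification, not the paper's algorithm.
    """
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1

    rules = {}
    for key, counts in groups.items():
        value, hits = counts.most_common(1)[0]
        total = sum(counts.values())
        confidence = hits / total
        if total >= min_support and confidence >= min_conf:
            rules[key] = (value, confidence)
    return rules


def detect_suspects(rows, lhs, rhs, rules):
    """Return indices of rows that match a rule's LHS but not its RHS."""
    suspects = []
    for index, row in enumerate(rows):
        key = tuple(row[attr] for attr in lhs)
        if key in rules and row[rhs] != rules[key][0]:
            suspects.append(index)
    return suspects


# Hypothetical sample relation: postal code should determine city.
rows = (
    [{"zip": "510006", "city": "Guangzhou"}] * 4
    + [{"zip": "510006", "city": "Shenzhen"}]   # likely an error
    + [{"zip": "518000", "city": "Shenzhen"}] * 2
)

rules = mine_rules(rows, lhs=["zip"], rhs="city")
suspects = detect_suspects(rows, lhs=["zip"], rhs="city", rules=rules)
```

In this sketch a rule with confidence 1.0 behaves like a conditional functional dependency on the matched values, while a rule with confidence below 1.0 behaves like an association rule whose exceptions are reported as suspect tuples.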
