
Application of Data Cleaning Technology in a Network Teaching Evaluation System (cited by: 1)
Abstract: The data warehouse that describes network teaching contains a large volume of data imported from a variety of data sources, and data quality problems directly affect the results of teaching evaluation. To handle duplicate student information, the paper proposes a segmentation strategy based on data type which, combined with the edit distance algorithm, can effectively detect duplicate basic student records. Experimental results indicate that the method improves both the running efficiency and the detection precision of the algorithm.
Author: Liu Zhe (刘哲)
Source: Network & Information (《网络与信息》), 2011, No. 8, pp. 40-41 (2 pages)
Funding: Liaoning Province Eleventh Five-Year Plan project, project no. JG 10DB192
Keywords: approximately duplicated records; segmentation; edit distance algorithm
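
The abstract names two ingredients, segmentation by data type and edit distance, but this record does not give their details. The sketch below is only one plausible reading, not the paper's implementation. It assumes that a field is split into runs of digits, Latin letters, and other word characters (such as Chinese characters), that tokens are compared position by position with the standard dynamic-programming edit distance, and that two records count as duplicates when their mean field similarity reaches a threshold. The names `segment`, `field_similarity`, `is_duplicate` and the 0.8 threshold are hypothetical.

```python
import re

# Assumed tokenizer: runs of digits, runs of Latin letters, or runs of
# other word characters (e.g. Chinese characters). Not taken from the paper.
_TOKEN = re.compile(r"\d+|[A-Za-z]+|[^\W\d_A-Za-z]+")


def segment(value):
    """Split a field value into tokens according to character type."""
    return _TOKEN.findall(value)


def edit_distance(a, b):
    """Standard dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Edit distance normalized to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def field_similarity(x, y):
    """Compare two field values token by token; unmatched tokens score 0."""
    tx, ty = segment(x), segment(y)
    n = max(len(tx), len(ty))
    if n == 0:
        return 1.0
    return sum(similarity(p, q) for p, q in zip(tx, ty)) / n


def is_duplicate(rec_a, rec_b, threshold=0.8):
    """Flag two records as duplicates when mean field similarity >= threshold.

    The threshold value is an illustrative assumption.
    """
    scores = [field_similarity(a, b) for a, b in zip(rec_a, rec_b)]
    if not scores:
        return False
    return sum(scores) / len(scores) >= threshold
```

For example, `is_duplicate(("Liu Zhe", "20110801"), ("Liu Zhee", "20110801"))` compares the name tokens and the date token separately, so a one-character typo in the name lowers only that token's score. Comparing short tokens instead of whole field strings also keeps each edit distance table small, which is one way segmentation could reduce running time.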
