Decision Tree Based on Pretreatment and Its Application in Chemical Data Mining
Abstract: Chemical data mining can extract valuable knowledge from large volumes of data, and the decision tree is an important tool for this task. Because decision trees are limited in handling continuous data, this study proposes a pretreatment step: continuous attributes are first discretized, redundant attributes are then removed by feature selection, and the decision tree is built on the result. This keeps the tree model from becoming overly detailed (over-fitting) and gives it good predictive performance. The method was applied to two chemical sample classification problems, glass and wine, with prediction accuracies of 94.7% and 96.67% and self-check accuracies of 95.5% and 96.88%, respectively. Compared with Bayes discriminant analysis and a decision tree built without pretreatment, the prediction accuracy is markedly higher, and the resulting classification rules are explicit and easy to understand and analyze. The method is therefore well suited to mining chemical pattern classification rules.
Source: Chinese Journal of Analytical Chemistry (《分析化学》), 2005, No. 8, pp. 1091-1094 (4 pages). Indexed in SCIE, EI, CAS, CSCD, and the Peking University Core Journals list.
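Funding: Supported by the National Natural Science Foundation of China (No. 20276063) and the Key Science and Technology Project of Zhejiang Province (No. 2004C21054).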
Keywords: pretreatment, decision tree, chemical data mining, discretization, feature selection, chemical pattern classification
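
The abstract outlines the pipeline (discretize continuous attributes, drop redundant ones, then build the tree) but does not name the specific algorithms. The following is a minimal sketch under stated assumptions: equal-width binning stands in for the discretization step, an information-gain threshold stands in for feature selection, and an ID3-style tree stands in for the classifier. The toy data, the `bins` and `min_gain` parameters, and all function names are hypothetical and do not come from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def discretize(column, bins=3):
    """Equal-width binning (assumed method): map values to bin indices 0..bins-1."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant column
    return [min(int((v - lo) / width), bins - 1) for v in column]

def info_gain(column, labels):
    """Information gain obtained by splitting `labels` on a discrete `column`."""
    n = len(labels)
    groups = {}
    for v, y in zip(column, labels):
        groups.setdefault(v, []).append(y)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())

def select_features(columns, labels, min_gain=0.1):
    """Keep the indices of attributes whose gain reaches `min_gain` (assumed criterion)."""
    return [i for i, col in enumerate(columns) if info_gain(col, labels) >= min_gain]

def build_tree(rows, labels, features):
    """ID3-style tree over discrete rows; a leaf is a class label."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda f: info_gain([r[f] for r in rows], labels))
    rest = [f for f in features if f != best]
    node = {"feature": best, "default": majority, "children": {}}
    for value in set(r[best] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        node["children"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rest
        )
    return node

def predict(tree, row):
    """Walk the tree; unseen attribute values fall back to the node's majority class."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["feature"]], tree["default"])
    return tree

# Hypothetical toy data: attribute 0 separates the classes, attribute 1 is redundant.
X = [[1.0, 0.1], [1.2, 0.9], [3.1, 0.1], [3.3, 0.9], [5.0, 0.1], [5.2, 0.9]]
y = ["A", "A", "B", "B", "C", "C"]

columns = [discretize(list(col)) for col in zip(*X)]   # step 1: discretization
selected = select_features(columns, y)                 # step 2: feature selection
rows = list(zip(*columns))
tree = build_tree(rows, y, selected)                   # step 3: decision tree
self_check = sum(predict(tree, r) == t for r, t in zip(rows, y)) / len(y)
print(selected, self_check)
```

Here `self_check` is the accuracy on the training samples themselves, which mirrors the "self-check" rate reported in the abstract. Measuring the prediction rate would require a held-out set discretized with the bin edges learned from the training data, which this sketch leaves out for brevity.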