
An Improved C4.5 Decision Tree Algorithm Based on Attribute Correlation (一种基于属性相关的C4.5决策树改进算法)

Cited by: 13
Abstract: When a C4.5 decision tree is constructed, the selection of the test attribute does not take into account how the attributes influence one another. To address this shortcoming, an improved C4.5 decision tree algorithm is proposed. The algorithm represents the redundancy between an attribute and the other attributes by the average information entropy between that attribute and the others, and adds this redundancy to the test-attribute selection step. Candidate attributes are evaluated on three factors (information gain, split entropy, and redundancy), so that the attribute chosen has a high information gain ratio and low redundancy with the other attributes. Experimental results show that, on the selected experimental data sets, the improved C4.5 decision tree algorithm achieves a higher average classification accuracy.
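The selection step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so two things here are assumptions and not the authors' definitions: redundancy is taken to be the mean mutual information between a candidate attribute and each other attribute, and the three factors are combined as gain ratio divided by (1 + redundancy). The function names (`redundancy`, `select_attribute`) are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(xs, ys):
    """H(ys) - H(ys | xs) for two discrete sequences (their mutual information)."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(ys) - conditional


def gain_ratio(xs, ys):
    """Standard C4.5 gain ratio: information gain divided by split entropy."""
    split = entropy(xs)  # split entropy of the attribute itself
    return information_gain(xs, ys) / split if split > 0 else 0.0


def redundancy(name, columns):
    """ASSUMPTION: mean mutual information between `name` and every other attribute."""
    others = [col for key, col in columns.items() if key != name]
    if not others:
        return 0.0
    return sum(information_gain(columns[name], o) for o in others) / len(others)


def select_attribute(columns, labels):
    """ASSUMPTION: score = gain ratio / (1 + redundancy); the highest score wins."""
    def score(name):
        return gain_ratio(columns[name], labels) / (1.0 + redundancy(name, columns))
    return max(columns, key=score)


# Toy data: attribute "b" duplicates "a", and "c" is independent of both.
columns = {"a": [0, 0, 1, 1], "b": [0, 0, 1, 1], "c": [0, 1, 0, 1]}
labels = [0, 0, 1, 1]
best = select_attribute(columns, labels)
```

Dividing by (1 + redundancy) leaves the plain C4.5 gain ratio unchanged when an attribute shares no information with the others and lowers the score as the overlap grows. Other combinations of the three factors would fit the abstract's description equally well.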
Authors: Wei Hao (魏浩), Ding Yaojun (丁要军)
Source: Journal of North University of China (Natural Science Edition) (《中北大学学报(自然科学版)》), indexed in CAS and the Peking University Core Journals list (北大核心), 2014, No. 4, pp. 402-406 (5 pages)
Keywords: C4.5 decision tree; attribute correlation; information entropy; information gain ratio; redundancy
