Abstract
Both accuracy and diversity matter in a decision tree ensemble: an ensemble built from accurate and diverse decision trees can improve classification accuracy on unseen samples. This paper proposes a weighted Jaccard distance (WJD) to measure the diversity of decision trees, analyzes the properties of WJD, and uses WJD-based hierarchical clustering to select the trees for an ensemble. Comparative experiments on UCI datasets show that WJD is an effective diversity measure and that decision tree ensemble selection based on WJD achieves relatively high prediction accuracy.
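The abstract does not give the WJD formula, so the following is only a minimal sketch of a generic weighted Jaccard distance between two sets, not the paper's definition. It assumes, purely for illustration, that each tree is represented by the set of validation samples it misclassifies and that every sample carries a non-negative weight; the names `weighted_jaccard_distance`, `errors`, and `weights` are hypothetical.

```python
def weighted_jaccard_distance(a, b, weights):
    """Generic weighted Jaccard distance between two sets of sample indices.

    Assumption (not from the paper): the distance is
    1 - (total weight of the intersection) / (total weight of the union).
    """
    union = a | b
    union_w = sum(weights[i] for i in union)
    if union_w == 0:
        # Two empty sets (or a zero-weight union) are treated as identical.
        return 0.0
    inter_w = sum(weights[i] for i in a & b)
    return 1.0 - inter_w / union_w


# Hypothetical data: indices of validation samples each tree misclassifies.
errors = {
    "tree_a": {0, 1, 2},
    "tree_b": {1, 2, 3},
    "tree_c": {7, 8},
}
weights = [1.0] * 10  # uniform sample weights, chosen only for the example

d_ab = weighted_jaccard_distance(errors["tree_a"], errors["tree_b"], weights)
d_ac = weighted_jaccard_distance(errors["tree_a"], errors["tree_c"], weights)
print(d_ab, d_ac)
```

Under these assumptions, trees that err on the same samples get a small distance and trees that err on disjoint samples get a distance of 1. A pairwise distance matrix built this way could then be passed to a hierarchical clustering routine, with one tree chosen per cluster, which is one plausible reading of the selection step the abstract describes.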
Authors
于凯
王立宏
YU Kai; WANG Li-hong (School of Computer and Control Engineering, Yantai University, Yantai 264005, China)
Source
《烟台大学学报(自然科学与工程版)》
CAS
2020, No. 2, pp. 204-211 (8 pages)
Journal of Yantai University (Natural Science and Engineering Edition)
Funding
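Supported by the National Natural Science Foundation of China (61773331, 71672166) and the Shandong Province Higher Educational Science and Technology Program (J17KA091).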