Abstract
Once the depth and the number of nodes of a decision tree exceed a certain size, its classification accuracy on unseen instances gradually declines as the tree grows further, so a pruning algorithm is needed to reduce the size of the tree without sacrificing classification accuracy. Building on an analysis of the strengths and weaknesses of existing decision tree pruning algorithms, this paper proposes an improved post-pruning algorithm that jointly considers classification accuracy, classification stability, and tree size. Experimental results show that the algorithm effectively reduces the size of the decision tree while preserving the accuracy and stability of the model, making the final automatic discrimination model more compact.
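For readers unfamiliar with post-pruning, the general idea can be illustrated with a minimal sketch of classical reduced-error pruning. This is not the improved algorithm proposed in the paper: it considers only validation accuracy and, as a side effect, tree size, and it does not model classification stability. The `Node` class and the helper names `predict`, `size`, `errors`, and `prune` are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


@dataclass
class Node:
    """Hypothetical binary decision-tree node used only for this sketch."""
    label: int                      # majority class of training data at this node
    feature: Optional[int] = None   # index of the split feature; None for a leaf
    threshold: float = 0.0          # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node: Node, x: Sequence[float]) -> int:
    """Walk from the given node down to a leaf and return its label."""
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def size(node: Node) -> int:
    """Number of nodes in the subtree rooted at node."""
    if node.is_leaf():
        return 1
    return 1 + size(node.left) + size(node.right)


def errors(node: Node, data: List[Sample]) -> int:
    """Number of samples in data that the subtree misclassifies."""
    return sum(1 for x, y in data if predict(node, x) != y)


def prune(node: Node, data: List[Sample]) -> Node:
    """Bottom-up reduced-error pruning.

    data holds the validation samples that reach this node. A subtree is
    replaced by a leaf whenever the leaf makes no more validation errors
    than the subtree does (ties, including empty data, favour the leaf).
    """
    if node.is_leaf():
        return node
    left_data = [(x, y) for x, y in data if x[node.feature] <= node.threshold]
    right_data = [(x, y) for x, y in data if x[node.feature] > node.threshold]
    node.left = prune(node.left, left_data)
    node.right = prune(node.right, right_data)
    subtree_errors = errors(node, data)
    leaf_errors = sum(1 for _, y in data if y != node.label)
    if leaf_errors <= subtree_errors:
        return Node(label=node.label)
    return node
```

Calling `prune(root, validation_samples)` returns a tree that is no larger than the original and makes no more errors on the validation samples. A criterion closer to the one described in the abstract would presumably add stability and size terms to the comparison, but the details of that criterion are not given here.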
Source
Computer & Digital Engineering (《计算机与数字工程》)
2015, No. 6, pp. 960-966, 971 (8 pages in total)
Keywords
classification algorithm, decision tree, pruning algorithm