Abstract
Decision tree algorithms are an active research area in data mining and a widely used approach to classification problems. However, constructing an optimal decision tree is an NP-hard problem. This paper first introduces the basic idea of the ID3 algorithm. Then, to address the algorithm's shortcomings, it introduces the concept of a general correlation function and proposes a decision tree construction method that uses the general correlation function between the condition attributes and the decision attribute as the attribute selection criterion. The method is compared experimentally with ID3. The experiments show that it not only optimizes the decision tree model but also markedly improves the prediction accuracy of the resulting trees.
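For readers unfamiliar with the baseline, the following is a minimal sketch of the attribute selection step in standard ID3, which picks the attribute with the highest information gain. It is not the paper's method: the general correlation function is not defined in this abstract, so it is not sketched here. The function names and the toy dataset are illustrative assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction on `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows, attributes, target):
    """ID3 selection criterion: the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data: "outlook" separates the classes perfectly, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

print(best_attribute(rows, ["outlook", "windy"], "play"))
```

A method like the one described in the abstract would replace `information_gain` with a different scoring function while keeping the same selection loop.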
Source
Computer Engineering and Applications (《计算机工程与应用》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2009, Issue 10, pp. 141-143 (3 pages)
Keywords
decision tree
general correlation function
ID3 algorithm