Abstract
Irrelevant and distracting attributes in a dataset degrade the performance of decision tree algorithms. To address this problem, a new decision tree algorithm is proposed. The algorithm first performs attribute reduction on the test attributes, retaining only the most relevant ones; it then uses the similarity between each test attribute and the decision attribute as the heuristic rule for constructing the decision tree, and applies a classification threshold to simplify tree generation. Experiments show that the new algorithm outperforms the classic ID3 algorithm in both running efficiency and prediction accuracy on the test datasets.
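The abstract describes the splitting heuristic only at a high level, so the following is a minimal sketch rather than the authors' exact method. It assumes a simple purity-based similarity score between a test attribute and the decision attribute (majority-class agreement within each attribute value group) and a classification threshold that stops tree growth once a node is sufficiently pure; the function names, the threshold value, and the toy data are hypothetical.

```python
# Minimal sketch: split on the attribute most "similar" to the decision
# attribute, and stop early via a classification threshold.
# The similarity measure here is an assumption for illustration only.
from collections import Counter, defaultdict

def similarity(rows, attr, target):
    """Fraction of rows whose class matches the majority class of the
    group sharing the same value of `attr` (a simple purity score)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[target])
    agree = sum(Counter(labels).most_common(1)[0][1] for labels in groups.values())
    return agree / len(rows)

def build_tree(rows, attrs, target, threshold=0.9):
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when no attributes remain or the node is already "pure enough"
    # according to the classification threshold.
    if not attrs or labels.count(majority) / len(labels) >= threshold:
        return majority
    best = max(attrs, key=lambda a: similarity(rows, a, target))
    tree = {best: {}}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        tree[best][value] = build_tree(
            subset, [a for a in attrs if a != best], target, threshold)
    return tree

if __name__ == "__main__":
    data = [
        {"outlook": "sunny",  "windy": "no",  "play": "no"},
        {"outlook": "sunny",  "windy": "yes", "play": "no"},
        {"outlook": "rainy",  "windy": "no",  "play": "yes"},
        {"outlook": "rainy",  "windy": "yes", "play": "no"},
        {"outlook": "cloudy", "windy": "no",  "play": "yes"},
    ]
    print(build_tree(data, ["outlook", "windy"], "play"))
```

In ID3 the split criterion would be information gain; here the purity-based similarity plays that role, and the threshold replaces deeper recursion with a leaf once most samples at a node agree on a class.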
Authors
楚有斌
唐瑞春
王介强
CHU You-bin, TANG Rui-chun, WANG Jie-qiang (Ocean University of China, Qingdao 266000, China)
Source
《电脑知识与技术》
2007, No. 8, pp. 830-831 (2 pages)
Computer Knowledge and Technology
Keywords
Decision tree
ID3 algorithm
Attribute similarity