Abstract
To address the bias of the ID3 algorithm toward attributes with many values, a new attribute-selection measure based on rough set theory is proposed. Drawing on the properties of information entropy, the method takes into account both the certainty and the uncertainty of the rule set as a whole. Three measures of rule quality (support, confidence, and coverage) are analyzed and quantified, and a rule extraction algorithm is proposed. A worked example indicates that the algorithm improves the accuracy of decision classification and can therefore extract reliable and effective rules.
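The three rule measures named in the abstract can be illustrated with a minimal sketch. It uses the textbook rough-set definitions for a rule "condition → decision" (support = |C ∩ D| / |U|, confidence = |C ∩ D| / |C|, coverage = |C ∩ D| / |D|); the abstract does not state the paper's own quantification, which may differ. The function name `rule_measures` and the toy table are hypothetical and serve only as illustration.

```python
from fractions import Fraction


def rule_measures(rows, condition, decision):
    """Return (support, confidence, coverage) of the rule condition -> decision.

    rows      : list of dicts, one per object in the decision table
    condition : dict of attribute -> value for the rule's antecedent
    decision  : dict of attribute -> value for the rule's consequent
    Textbook definitions are assumed, not the paper's own formulas.
    """
    if not rows:
        raise ValueError("decision table is empty")

    def matches(row, pattern):
        return all(row.get(k) == v for k, v in pattern.items())

    n_cond = sum(matches(r, condition) for r in rows)
    n_dec = sum(matches(r, decision) for r in rows)
    n_both = sum(matches(r, condition) and matches(r, decision) for r in rows)

    support = Fraction(n_both, len(rows))
    confidence = Fraction(n_both, n_cond) if n_cond else Fraction(0)
    coverage = Fraction(n_both, n_dec) if n_dec else Fraction(0)
    return support, confidence, coverage


# Hypothetical toy decision table, invented for illustration only.
TOY_TABLE = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
]

support, confidence, coverage = rule_measures(
    TOY_TABLE, {"outlook": "sunny"}, {"play": "yes"}
)
print(support, confidence, coverage)  # 2/5 2/3 2/3
```

Exact fractions are used instead of floats so that the three ratios can be compared without rounding noise.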
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2009, Issue 14, pp. 149-151, 166 (4 pages)
Funding
National Natural Science Foundation of China, No. 70572070
No. 60674056
No. 70771007
Keywords
decision tree
ID3 algorithm
rough sets
rule extraction
data mining