Abstract
Computing the core is a bottleneck for the speed and efficiency of attribute reduction algorithms. To address this, a core computation algorithm based on the positive region is proposed. The positive region is computed using the idea of radix sorting, once with respect to the full condition attribute set and once with respect to the condition attribute set with a single attribute removed. The difference between the cardinalities of these two positive regions determines whether the removed attribute is a core attribute. Applying this test to every condition attribute in turn yields the core quickly. The time complexity of the proposed algorithm is O(|C||U|). Experimental results show that the time needed to compute the core grows linearly with the number of objects, and that at the largest number of objects tested it is only 0.6% of the time taken by the comparison algorithm. The experiments also indicate that the algorithm adapts well to various kinds of data sets.
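The core test described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it groups objects with a hash map instead of radix sorting, so it runs in roughly O(|C|²|U|) rather than the stated O(|C||U|). The function names, the tuple-based table layout, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def positive_region_size(rows, attrs, decision):
    """Count objects whose equivalence class under `attrs` is consistent,
    i.e. every object in the class has the same decision value."""
    # key of attribute values -> [number of objects, set of decision values seen]
    groups = defaultdict(lambda: [0, set()])
    for row in rows:
        key = tuple(row[a] for a in attrs)
        entry = groups[key]
        entry[0] += 1
        entry[1].add(row[decision])
    return sum(count for count, decisions in groups.values() if len(decisions) == 1)


def core_attributes(rows, cond_attrs, decision):
    """Return the condition attributes whose removal changes the size of the
    positive region (the cardinality-difference test from the abstract)."""
    full = positive_region_size(rows, cond_attrs, decision)
    core = []
    for a in cond_attrs:
        reduced = [b for b in cond_attrs if b != a]
        if positive_region_size(rows, reduced, decision) != full:
            core.append(a)
    return core


# Hypothetical toy decision table: columns 0..2 are condition attributes,
# column 3 is the decision (here the XOR of columns 0 and 1).
TABLE = [
    (0, 0, 0, 0),
    (0, 1, 0, 1),
    (1, 0, 0, 1),
    (1, 1, 0, 0),
]

print(core_attributes(TABLE, [0, 1, 2], 3))  # prints [0, 1]
```

In this toy table, dropping attribute 0 or attribute 1 makes every equivalence class inconsistent, so the positive region shrinks from 4 objects to 0 and both attributes belong to the core. Dropping the constant attribute 2 changes nothing, so it does not.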
Source
Journal of Xi'an Jiaotong University
EI
CAS
CSCD
PKU Core Journals
2007, No. 6, pp. 688-691 (4 pages)
Funding
National High Technology Research and Development Program of China (2006AA01Z210)
Keywords
attribute reduction
radix sorting
positive region
core