Abstract
A knowledge classification method based on the discretization of continuous attributes is proposed. The condition attributes are first sorted by significance in descending order, and the condition attributes in the decision table are then discretized one by one in that order. While each condition attribute is being discretized, both the condition attributes that have already been discretized and the decision attribute are taken into account, so the discretized decision table needs no further reduction. The algorithm is tested on simulated data and on data sets from the UCI machine learning repository and is compared with other discretization algorithms; the results demonstrate the effectiveness of the new method.
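The abstract does not say how attribute significance is computed or how cut points are chosen, so the following Python sketch only illustrates the general idea of discretizing attributes in a fixed order while consulting the already discretized attributes and the decision attribute. The function name `discretize_in_order`, the midpoint cut rule, and the assumption that the significance ordering is supplied by the caller are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from collections import defaultdict


def discretize_in_order(rows, decisions, order):
    """Greedy ordered discretization (illustrative sketch, not the paper's algorithm).

    rows      -- list of objects, each a list of continuous attribute values
    decisions -- decision label of each object
    order     -- attribute indices, most significant first (assumed to be given)
    Returns (cuts, codes): cut points per attribute and interval codes per object.
    """
    codes = [() for _ in rows]  # interval codes assigned so far, one tuple per object
    cuts = {}
    for a in order:
        # Group objects that are indistinguishable on the attributes
        # discretized so far.
        groups = defaultdict(list)
        for i, code in enumerate(codes):
            groups[code].append(i)

        attr_cuts = set()
        for members in groups.values():
            # Collect the decision labels seen at each distinct value of attribute a.
            by_value = defaultdict(set)
            for i in members:
                by_value[rows[i][a]].add(decisions[i])
            values = sorted(by_value)
            # Assumed rule: cut at the midpoint of adjacent values whose
            # decision label sets differ.
            for lo, hi in zip(values, values[1:]):
                if by_value[lo] != by_value[hi]:
                    attr_cuts.add((lo + hi) / 2)

        cuts[a] = sorted(attr_cuts)
        # Extend every object's code with its interval index on attribute a.
        codes = [
            code + (bisect_right(cuts[a], rows[i][a]),)
            for i, code in enumerate(codes)
        ]
    return cuts, codes


if __name__ == "__main__":
    table = [[1.0, 5.0], [2.0, 5.0], [3.0, 1.0], [4.0, 9.0]]
    labels = ["n", "n", "y", "y"]
    print(discretize_in_order(table, labels, order=[0, 1]))
```

In this toy table the first attribute alone separates the two decision classes, so the second attribute receives no cut points and collapses to a single interval. That is one way to read the abstract's claim that the discretized table needs no further reduction, although the paper's actual criterion may differ.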
Source
《东北师大学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University core journal list)
2012, No. 1, pp. 45-49 (5 pages)
Journal of Northeast Normal University(Natural Science Edition)
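Funding
National Natural Science Foundation of China (60673099, 60873146)
Jilin Province Science and Technology Development Program (201105056)
Science and Technology Program of the Jilin Provincial Department of Education (2007172, 2010383)
Changchun Normal College Intramural Youth Fund (010, 012)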
Keywords
rough set
discretization
attribute significance
interval partition
breakpoint