Abstract
Large volumes of data are common in practice, and rough sets are often used to reduce such data by removing its redundant parts. Most studies, however, do not consider how the reduction affects the original classification, and the commonly used naive Bayes algorithm has difficulty obtaining prior probabilities. To address these problems, this paper proposes a Bayesian classification algorithm based on rough sets. First, the dependency between the decision attribute and the condition attributes in rough set theory is used to reduce the attribute set and eliminate redundant data. Then, the Bayesian algorithm is applied to the reduced data for knowledge mining. Finally, the method is evaluated through a comparative analysis on fault source data against the original system, which shows improved accuracy. The method avoids the requirements of traditional naive Bayes for prior probabilities and for conditional independence among feature attributes, and it markedly improves data classification and prediction ability.
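The two stages named in the abstract (dependency-based attribute reduction, then Bayesian classification on the reduced table) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the toy fault table, the backward-elimination strategy, and the Laplace-smoothed frequency estimates are all assumptions chosen for illustration, since the abstract does not give the exact procedure.

```python
from collections import Counter, defaultdict

def dependency(rows, cond, dec):
    """Rough-set dependency degree: |POS_cond(dec)| / |U|."""
    decisions = defaultdict(set)  # equivalence class -> decision values seen
    sizes = Counter()             # equivalence class -> number of objects
    for r in rows:
        key = tuple(r[a] for a in cond)
        decisions[key].add(r[dec])
        sizes[key] += 1
    # An equivalence class lies in the positive region if it is decision-consistent.
    pos = sum(sizes[k] for k, ds in decisions.items() if len(ds) == 1)
    return pos / len(rows)

def reduce_attributes(rows, cond, dec):
    """Backward elimination (assumed strategy): drop an attribute
    whenever the dependency degree stays the same without it."""
    full = dependency(rows, cond, dec)
    kept = list(cond)
    for a in cond:
        trial = [x for x in kept if x != a]
        if trial and dependency(rows, trial, dec) == full:
            kept = trial
    return kept

def train_nb(rows, attrs, dec):
    """Collect the counts a categorical naive Bayes classifier needs."""
    class_counts = Counter(r[dec] for r in rows)
    value_counts = Counter((r[dec], a, r[a]) for r in rows for a in attrs)
    domains = {a: {r[a] for r in rows} for a in attrs}
    return class_counts, value_counts, domains

def predict_nb(model, attrs, sample):
    """Return the class with the highest Laplace-smoothed posterior score."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best, best_score = None, -1.0
    for c, n in class_counts.items():
        score = n / total  # class frequency used as the prior estimate
        for a in attrs:
            score *= (value_counts[(c, a, sample[a])] + 1) / (n + len(domains[a]))
        if score > best_score:
            best, best_score = c, score
    return best

# Hypothetical fault table; "noise" is fully determined by "temp".
rows = [
    {"noise": "loud",  "temp": "high", "vib": "high", "fault": "bearing"},
    {"noise": "loud",  "temp": "high", "vib": "low",  "fault": "overheat"},
    {"noise": "quiet", "temp": "low",  "vib": "high", "fault": "bearing"},
    {"noise": "quiet", "temp": "low",  "vib": "low",  "fault": "none"},
    {"noise": "loud",  "temp": "high", "vib": "high", "fault": "bearing"},
    {"noise": "quiet", "temp": "low",  "vib": "low",  "fault": "none"},
]
cond = ["noise", "temp", "vib"]

reduct = reduce_attributes(rows, cond, "fault")   # redundant "noise" is dropped
model = train_nb(rows, reduct, "fault")
prediction = predict_nb(model, reduct, {"temp": "high", "vib": "low"})
print(reduct, prediction)
```

In this toy table the dependency degree of the full attribute set is 1.0, and it stays at 1.0 after removing `noise`, so the classifier is trained on `temp` and `vib` only. Which reduct is found depends on the order in which attributes are tried; other orders may keep a different but equally valid subset.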
Source
Journal of Anqing Teachers College (Natural Science Edition) (《安庆师范学院学报(自然科学版)》)
2014, No. 1, pp. 36-40 (5 pages)
Keywords
rough set
data mining
fault source