
An Improved Weighted Naive Bayes Classification Algorithm Using Feature Correlation (cited by: 30)
Abstract: The naive Bayes classification algorithm assumes that features are strongly independent of one another, an assumption that rarely holds in practice. To relax it to some extent, this paper proposes an improved weighted naive Bayes classification algorithm based on feature correlation. The algorithm uses a new weighting scheme, called TF-IDF-FC, which builds on the traditional term frequency-inverse document frequency (TF-IDF) weight, takes into account how each feature is distributed within and between classes, and further adjusts the weight according to the correlation between features, so that the features most representative of their class receive larger weights. Compared with weighted naive Bayes classification based on traditional TF-IDF weights and with other commonly used weighted naive Bayes algorithms, such as attribute-weighted naive Bayes, the proposed algorithm improves classification performance to a certain extent.
Source: Journal of Xiamen University (Natural Science), indexed in CAS, CSCD, and the Peking University Core Journals list; 2012, No. 4, pp. 682-685 (4 pages)
Keywords: naive Bayes text classification; weighted naive Bayes text classification; TF-IDF weight; feature correlation
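To make the baseline in the abstract concrete, the following is a minimal sketch of a naive Bayes text classifier weighted by plain TF-IDF, the method the paper compares against. It is not the TF-IDF-FC scheme, whose formula the abstract does not give. The function names `train` and `classify`, the Laplace smoothing, the smoothed IDF variant, and the choice to scale each term's log-likelihood by its TF-IDF weight are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels):
    """Estimate class priors, Laplace-smoothed term likelihoods, and IDF.

    docs is a list of token lists; labels is the parallel list of class names.
    """
    vocab = sorted({t for d in docs for t in d})
    n_docs = len(docs)
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)  # class -> term -> occurrence count
    doc_freq = Counter()                # term -> number of documents containing it
    for tokens, label in zip(docs, labels):
        term_counts[label].update(tokens)
        doc_freq.update(set(tokens))
    log_prior = {c: math.log(n / n_docs) for c, n in class_docs.items()}
    log_like = {}
    for c in class_docs:
        total = sum(term_counts[c].values()) + len(vocab)
        log_like[c] = {t: math.log((term_counts[c][t] + 1) / total) for t in vocab}
    # Assumed IDF variant: smoothed so that every term keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return log_prior, log_like, idf


def classify(tokens, model):
    """Return argmax over c of log P(c) + sum_t w_t * log P(t | c), w_t = tf * idf."""
    log_prior, log_like, idf = model
    tf = Counter(t for t in tokens if t in idf)  # out-of-vocabulary terms are ignored
    scores = {
        c: log_prior[c] + sum(n * idf[t] * log_like[c][t] for t, n in tf.items())
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy corpus, invented purely for illustration.
docs = [["ball", "goal", "team"], ["team", "match", "goal"],
        ["stock", "market", "bank"], ["bank", "rate", "market"]]
labels = ["sports", "sports", "finance", "finance"]
model = train(docs, labels)
print(classify(["goal", "team", "match"], model))
```

In this formulation the weight acts as an exponent on each term's class-conditional probability, so a scheme such as TF-IDF-FC would presumably replace only the `n * idf[t]` factor while leaving the rest of the classifier unchanged.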