Abstract
The naive Bayes (NB) classifier is a simple and effective probabilistic classification method, but its attribute independence assumption often does not hold in real-world data. To improve its classification performance, a great deal of recent research has focused on building models that capture the dependencies among attributes. This paper proposes a method for measuring the correlation of a vector: the probability that a feature vector belongs to a class is computed from the vector's correlation degree and the probabilities of its attributes. The vector correlation degree can be estimated with a formula given in the paper. Experimental results show that a classifier built with this method clearly outperforms naive Bayes and also achieves some improvement over other similar algorithms.
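For context, the attribute independence assumption described in the abstract can be made concrete with a minimal naive Bayes sketch for categorical attributes. This illustrates only the NB baseline, under which the class score is the prior multiplied by one independent factor per attribute. It is not the paper's vector correlation method, whose formula does not appear in this abstract. The function names `train_nb` and `predict_nb`, the toy data, and the use of Laplace smoothing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def train_nb(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute index, class)][value] -> frequency
    value_counts = defaultdict(Counter)
    # values[i] -> set of distinct values observed for attribute i
    values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, label)][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values


def predict_nb(model, row):
    """Return the class maximizing log P(c) + sum_i log P(x_i | c)."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_c in class_counts.items():
        score = math.log(n_c / total)  # class prior
        for i, v in enumerate(row):
            # Each attribute contributes independently given the class.
            # Laplace (add-one) smoothing is an assumption of this sketch.
            p = (value_counts[(i, label)][v] + 1) / (n_c + len(values[i]))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (outlook, temperature) -> play
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "hot")))
```

Because every attribute enters the score as a separate factor, any dependence between attributes is ignored; this is the limitation that dependency-aware models such as the one proposed here aim to address.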
Source
《情报学报》
CSSCI
Peking University Core Journal (北大核心)
2007, Issue 2, pp. 271-274 (4 pages)
Journal of the China Society for Scientific and Technical Information
Keywords
classification model
Bayes theorem
attribute correlation
vector correlation degree