Abstract
To meet the needs of technology roadmap development, and to address the sensitivity of fuzzy c-means (FCM) to initial values and its poor stability, an improved fuzzy c-means text clustering mining method, CGFCM, is proposed by introducing a genetic algorithm and the concept vector of a class. Exploiting the global search capability of genetic algorithms, CGFCM first uses a genetic algorithm to find the initial cluster centers of the text. It then builds a concept vector matrix from the class concept vectors and iterates this matrix to complete the fuzzy clustering partition of the text, thereby achieving text clustering mining. Finally, an example comparing CGFCM with FCM verifies the mining performance of the improved method.
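The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, because the abstract does not give the chromosome encoding, the fitness function, or the exact definition of the concept vector. The sketch assumes that a chromosome is a set of document indices used as candidate centers, that fitness is the standard FCM objective, and that a class concept vector is the unit-normalized membership-weighted mean of the document vectors. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def memberships(docs, centers, m=2.0):
    """Standard FCM membership update; each row sums to 1."""
    U = []
    for d in docs:
        ds = [max(dist2(d, c), 1e-12) for c in centers]  # avoid division by zero
        U.append([1.0 / sum((dj / dk) ** (1.0 / (m - 1.0)) for dk in ds)
                  for dj in ds])
    return U

def objective(docs, centers, m=2.0):
    """FCM objective: sum of u_ij^m times squared distance."""
    U = memberships(docs, centers, m)
    return sum(U[i][j] ** m * dist2(docs[i], centers[j])
               for i in range(len(docs)) for j in range(len(centers)))

def ga_initial_centers(docs, c, rng, pop_size=20, generations=30, p_mut=0.2):
    """Assumed GA: a chromosome is a list of c distinct document indices."""
    n = len(docs)
    cost = lambda ind: objective(docs, [docs[i] for i in ind])
    pop = [rng.sample(range(n), c) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[:pop_size // 2]          # truncation selection with elitism
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = rng.sample(sorted(set(a) | set(b)), c)  # crossover
            if rng.random() < p_mut:                        # mutation
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(c)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    return [list(docs[i]) for i in min(pop, key=cost)]

def fcm_concept(docs, centers, m=2.0, max_iter=100, tol=1e-6):
    """Iterate memberships and the (assumed) concept vector matrix."""
    for _ in range(max_iter):
        U = memberships(docs, centers, m)
        new_centers = []
        for j in range(len(centers)):
            w = [U[i][j] ** m for i in range(len(docs))]
            mean = [sum(w[i] * docs[i][t] for i in range(len(docs))) / sum(w)
                    for t in range(len(docs[0]))]
            new_centers.append(normalize(mean))  # concept vector of class j
        shift = max(dist2(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return memberships(docs, centers, m), centers

# Toy term-frequency vectors (made up): two topics over three terms.
raw = [[5, 1, 0], [4, 2, 0], [6, 1, 1], [0, 1, 5], [1, 0, 4], [0, 2, 6]]
docs = [normalize(v) for v in raw]
rng = random.Random(0)
init = ga_initial_centers(docs, 2, rng)
U, centers = fcm_concept(docs, init)
labels = [row.index(max(row)) for row in U]  # hardened cluster assignment
```

Choosing initial centers from the documents themselves keeps the search space small. A real-valued encoding of the centers would be an equally plausible reading of the abstract.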
Source
Journal of Hebei University of Technology (《河北工业大学学报》)
CAS
PKU Core (北大核心)
2011, No. 3, pp. 40-44 (5 pages)
Funding
Hebei Province Soft Science Research Program (094472119D-1)
Keywords
text clustering mining
fuzzy c-means
matrix
genetic algorithm