Abstract
In text classification systems that represent documents with the vector space model (VSM), dimensionality reduction is a necessary step, and feature selection is widely used for it because of its low computational complexity. This paper proposes a new text feature selection algorithm based on the Fisher linear discriminant model. The algorithm recasts the solution process as the problem of finding an optimal combination of feature terms, which avoids complex matrix transformations. Experiments show that the method has a clear advantage over the information gain (IG) and chi-square statistic (CHI) methods.
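The abstract does not state the selection criterion itself, so the following is only a minimal sketch of the general idea, assuming the common per-term Fisher score (between-class scatter divided by within-class scatter, computed one term at a time). This scoring rule, the function names `fisher_scores` and `select_top_k`, and the toy data are illustrative assumptions and not the paper's algorithm. Scoring each term independently is one way to rank features without any matrix inversion or eigendecomposition.

```python
from collections import defaultdict


def fisher_scores(docs, labels):
    """Return one Fisher-style score per term (column) of a term-frequency matrix.

    ASSUMPTION: score = between-class scatter / within-class scatter,
    evaluated independently for each term. Not taken from the paper.
    """
    n_docs = len(docs)
    n_terms = len(docs[0])

    # Group document vectors by class label.
    by_class = defaultdict(list)
    for vec, label in zip(docs, labels):
        by_class[label].append(vec)

    scores = []
    for j in range(n_terms):
        overall_mean = sum(vec[j] for vec in docs) / n_docs
        between = 0.0
        within = 0.0
        for rows in by_class.values():
            class_mean = sum(row[j] for row in rows) / len(rows)
            between += len(rows) * (class_mean - overall_mean) ** 2
            within += sum((row[j] - class_mean) ** 2 for row in rows)
        # Small constant guards against division by zero for constant terms.
        scores.append(between / (within + 1e-12))
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring terms, best first."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): 4 documents, 3 terms, 2 classes.
docs = [
    [3, 0, 1],
    [4, 0, 2],
    [0, 3, 1],
    [0, 5, 2],
]
labels = ["sports", "sports", "finance", "finance"]

scores = fisher_scores(docs, labels)
top2 = select_top_k(scores, 2)
```

In this toy example the third term has the same mean in both classes, so its score is zero and it is not selected. The first two terms each separate the classes and are kept.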
Source
《国防科技大学学报》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2008, No. 5, pp. 135-138 (4 pages)
Journal of National University of Defense Technology
Funding
Supported by the National Natural Science Foundation of China (Grant No. 70371008)