Abstract
Latent Semantic Indexing (LSI) applies Singular Value Decomposition (SVD) to the original term-document matrix to recover its latent semantic structure, which mitigates the problems of polysemy and synonymy to some extent. However, existing applications of LSI to text classification perform unsatisfactorily because they do not take full account of classification information. To address this, an improved Local LSI (LLSI) method is proposed that uses a Support Vector Machine (SVM) to produce the local region. Experimental results suggest that the proposed method is effective.
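The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy matrix, the function names (`lsi`, `fold_in`, `local_lsi`), and the rule "keep the `n_local` highest-scoring documents" are assumptions made for illustration, and `svm_scores` is a hand-written placeholder for the decision values a trained SVM would supply.

```python
import numpy as np

def lsi(term_doc, k):
    """Rank-k LSI of a (terms x docs) matrix via truncated SVD."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]                         # term directions of the latent space
    doc_coords = (s[:k, None] * Vt[:k]).T  # one k-dimensional row per document
    return U_k, doc_coords

def fold_in(U_k, term_vec):
    """Project a term vector (query or new document) into the latent space."""
    return U_k.T @ term_vec

def local_lsi(term_doc, scores, n_local, k):
    """Run LSI only on the n_local documents with the highest scores.

    Assumption: 'scores' plays the role of SVM decision values for one
    class; the selection rule here is illustrative, not the paper's.
    """
    local_idx = np.argsort(scores)[::-1][:n_local]
    U_k, _ = lsi(term_doc[:, local_idx], k)
    return U_k, local_idx

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy term-document matrix (rows: terms, columns: documents).
A = np.array([
    [2.0, 1.0, 0.0, 0.0],  # "car"
    [1.0, 2.0, 0.0, 0.0],  # "automobile"
    [0.0, 0.0, 1.0, 3.0],  # "fruit"
    [0.0, 0.0, 3.0, 1.0],  # "apple"
])

U_k, doc_coords = lsi(A, k=2)

# A query containing only "car" still matches the "automobile"-heavy document.
query = fold_in(U_k, np.array([1.0, 0.0, 0.0, 0.0]))
sim_same_topic = cosine(query, doc_coords[1])
sim_other_topic = cosine(query, doc_coords[2])

# Hypothetical SVM decision values for a "vehicles" class, one per document.
svm_scores = np.array([0.9, 0.8, -0.5, -0.7])
U_local, local_idx = local_lsi(A, svm_scores, n_local=2, k=1)
```

In the global space the synonym effect shows up as a high cosine between the "car" query and the "automobile" document. The local variant fits the SVD only to the documents the classifier considers relevant to the class, so the latent dimensions are spent on that class rather than on the whole corpus. In practice the scores might come from something like scikit-learn's `decision_function` on a trained SVM, which is likewise an assumption here.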
Source
《计算机应用》
CSCD
Peking University Core Journal
2007, No. 6, pp. 1382-1384 (3 pages)
Journal of Computer Applications
Funding
Supported by the Key Science and Technology Program of Gansu Province (2GS047-A52-002-03)
Keywords
text classification
Latent Semantic Indexing (LSI)
Support Vector Machine (SVM)
local region