Abstract
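This paper examines, from a theoretical perspective, the algorithms used to filter special-topic literature with the vector space model and its improved variants. The concept expansion model addresses synonymy among terms and thereby raises recall. The latent semantic analysis model uses statistical methods to extract and quantify latent semantic structures, which removes the influence of synonyms and polysemous words and improves the accuracy of text representation, so that both the recall and the precision of literature filtering in topical research improve significantly.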
This article discusses algorithms for literature filtering based on the vector space model (VSM) and other improved models. Concept-expanded VSM can enhance recall by enriching the profile with semantically related terms. Latent semantic analysis (LSA) is a kind of VSM; it can improve the recall and precision of literature filtering systems by extracting and representing the contextual-usage meaning of words through statistical computations applied to a large corpus of text, thereby eliminating the influence of synonymy and polysemy.
Source
《情报学报》
CSSCI
Peking University Core Journals (北大核心)
2005, No. 5, pp. 562-566 (5 pages)
Journal of the China Society for Scientific and Technical Information
Funding
Project funded by the Chinese Academy of Sciences
Keywords
vector space model
special-topic literature
filtering algorithm
latent semantics
literature retrieval
vector space model, information filtering, algorithm research, latent semantic analysis