Abstract
Vector representations of documents are important for research on literature aggregation, clustering, and classification. Building on the vector space model (VSM), this paper proposes a latent semantic vector space model (CLSVSM) that embeds the co-occurrence latent semantic correlation between document keywords into their representation vectors. CLSVSM is therefore a high-dimensional vector space model. Experiments on subject-classified literature from CNKI compared the new model with VSM, and the results show that CLSVSM clusters the literature markedly better than VSM.
Source
《情报学报》
CSSCI
Peking University Core Journals (北大核心)
2014, No. 10, pp. 1041-1045 (5 pages)
Journal of the China Society for Scientific and Technical Information
Funding
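This paper is one of the research outcomes of the National Social Science Fund of China major project "Research on Semantics-Based Deep Aggregation and Visual Display of Library Collection Resources" (grant no. 11&ZD152).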
Keywords
digital literature resources
high-dimensional vector
clustering
VSM
CLSVSM