Abstract
To address low retrieval efficiency, low recall, and low retrieval precision, this paper proposes a precise retrieval method for university library literature based on a random walk model. A distance matrix is built on the concept of information entropy: the Euclidean distance for numerical attribute data is combined with a distance for categorical attribute data to form a mixed-attribute data distance matrix, which is then used to reduce the dimensionality of the university library literature data. The designed random walk model identifies the subject terms of each document, and a clustering analysis based on word similarity completes the precise retrieval. In tests, the method's retrieval time was under 20 s and its recall exceeded 80%, with retrieval precision the author describes as ideal.
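The mixed-attribute distance matrix mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the following is only an assumption: it adds the Euclidean distance over numerical attributes to a simple mismatch count over categorical attributes, and it omits the information-entropy component entirely. The function names and the sample records are hypothetical.

```python
import math


def mixed_distance(a, b):
    """Distance between two records of the form (numerical, categorical).

    Assumption: Euclidean distance on the numerical attributes plus the
    number of mismatching categorical attributes. The paper's own
    entropy-based formulation is not given in the abstract.
    """
    a_num, a_cat = a
    b_num, b_cat = b
    euclid = math.sqrt(sum((x - y) ** 2 for x, y in zip(a_num, b_num)))
    mismatch = sum(1 for x, y in zip(a_cat, b_cat) if x != y)
    return euclid + mismatch


def distance_matrix(records):
    """Build the symmetric pairwise distance matrix for a list of records."""
    n = len(records)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = mixed_distance(records[i], records[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Hypothetical records: (numerical attributes, categorical attributes)
records = [
    ((2020.0, 12.0), ("physics", "journal")),
    ((2020.0, 12.0), ("physics", "book")),
    ((2023.0, 16.0), ("history", "book")),
]
matrix = distance_matrix(records)
```

A matrix of this kind could then feed a distance-based dimensionality reduction step; which technique the paper uses for that step is not stated in the abstract.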
Author
Gao Ping (高萍), Baoji University of Arts and Sciences, Baoji, Shaanxi 721006, China
Source
Bulletin of Science and Technology (《科技通报》)
2022, No. 8, pp. 118-121 (4 pages)
Keywords
random walk model
university library
data dimensionality reduction
literature retrieval
data distance matrix