2 articles found
Research on the Collection and Analysis of Related Words and Sentences (cited by: 1)
1
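Authors: Shen Yang (沈阳), Chan Yuan (婵元), Zhou Zixuan (周子轩). Library and Information Service (《图书情报工作》), CSSCI, PKU Core, 2009, No. 22, pp. 40-43 (4 pages)
Related-word and related-sentence sets on the Internet currently come from a narrow range of sources, and the formulas used to judge the relatedness of related words have not been considered from multiple angles or analyzed in theoretical depth. To address these problems, a prototype for collecting and analyzing related words is implemented. The related-word and related-sentence sets are deduplicated, and the related words are re-ranked with three methods: RSIS, RMRD, and DDRW. Related words are divided into five categories for an analysis of their characteristics, and in the empirical experiment the search engines are evaluated by a combination of human and machine assessment.
Keywords: related words; relatedness; meta search engine; ranking algorithm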
System of twice-gathering information and research of information fingerprint HashTrie
2
Authors: Shen Yang (沈阳), Chan Yuan (婵元), Li Shuchen (李舒晨). Journal of Southeast University (English Edition), EI, CAS, 2008, No. 3, pp. 381-384 (4 pages)
This paper presents a prototype of a twice-gathering interactive information system for e-government, designed for the case where the Intranet and the Extranet are physically isolated. Users in the Extranet gather, through client software, links to the latest related information that was previously collected from the Internet by web alert. Then, through ferry-type transport devices, the information is browsed by users in the Intranet, transported to a storage device, and synchronized with the web platform in the Intranet. During information gathering in the Extranet and data synchronization in the Intranet, repeated gathering and copying must be avoided by comparing the information fingerprints extracted from the web pages. The prototype uses HashTrie to store information fingerprints. In testing, the HashTrie-based structure is 2.28 times faster than Darts (double-array Trie), which is the fastest structure in the existing applied patent. The 12 existing types of high-speed hash functions that serve HashTrie are also implemented. When the dictionary holds more than 5×10^5 words, the PJWHash or SuperFastHash function can be adopted; when the dictionary holds 10^5 words, the CalcStrCR32 and ELFHash functions can be adopted.
Keywords: physical isolation; twice-gathering; duplicated web pages elimination; information fingerprint; HashTrie
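The duplicate check described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how fingerprints are extracted or how HashTrie is laid out internally. The sketch assumes that a fingerprint is the 32-bit ELF hash of whitespace-normalized, lowercased page text, written as eight hex digits, and that the store is a trie whose nodes look up children through a hash table (a Python dict). The names `elf_hash`, `fingerprint`, `HashTrieNode`, and `FingerprintStore` are hypothetical.

```python
def elf_hash(data: bytes) -> int:
    """Classic 32-bit ELF hash, one of the functions named in the abstract."""
    h = 0
    for byte in data:
        h = ((h << 4) + byte) & 0xFFFFFFFF  # shift in the next byte, keep 32 bits
        high = h & 0xF0000000               # top four bits
        if high:
            h ^= high >> 24                 # fold the top bits back into the low bits
        h &= ~high & 0xFFFFFFFF             # clear the top four bits
    return h


def fingerprint(text: str) -> str:
    """Assumed fingerprint: ELF hash of normalized text as 8 hex digits."""
    normalized = " ".join(text.split()).lower()
    return format(elf_hash(normalized.encode("utf-8")), "08x")


class HashTrieNode:
    """Trie node whose children are found through a hash table (dict)."""
    __slots__ = ("children", "terminal")

    def __init__(self) -> None:
        self.children = {}
        self.terminal = False


class FingerprintStore:
    """Stores fingerprints and reports whether each one was seen before."""

    def __init__(self) -> None:
        self.root = HashTrieNode()

    def add(self, fp: str) -> bool:
        """Insert fp; return True if it is new, False if it is a duplicate."""
        node = self.root
        for ch in fp:
            child = node.children.get(ch)
            if child is None:
                child = HashTrieNode()
                node.children[ch] = child
            node = child
        is_new = not node.terminal
        node.terminal = True
        return is_new


store = FingerprintStore()
pages = ["abc", "ABC  ", "abd"]
# Keep only pages whose fingerprint has not been stored yet.
unique_pages = [p for p in pages if store.add(fingerprint(p))]
```

A 32-bit fingerprint can collide for distinct pages, so a real system would likely need a longer fingerprint or a secondary comparison; the sketch only shows the store-and-compare step.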