Abstract
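Query result caching can cache either the set of document identifiers of a query's results or the actual returned pages, in order to speed up responses to user queries; the two forms are called the identifier cache and the page cache, respectively. For a fixed amount of memory, the identifier cache achieves a higher hit ratio, while the page cache achieves a faster response. Based on the temporal and spatial locality of user query accesses, this paper proposes a novel hierarchical result caching mechanism. First, the mechanism divides a fixed-size result cache into two layers: a page cache and an identifier cache. For a submitted query, it first tries to answer from the first-layer page cache and, on a miss, falls back to the second-layer identifier cache. Experiments show that, compared with traditional mechanisms that rely on a single cache form, this hierarchical mechanism achieves a considerable improvement in average query response time: for example, relative to a pure page cache, 9% on average and 11% in the best case. Second, on top of the identifier cache, the mechanism adds a heuristic prefetching strategy that exploits the spatial locality of user query retrieval. Experiments show that incorporating this prefetching strategy further improves retrieval system performance, ultimately yielding an effective result caching mechanism that covers both temporal and spatial locality.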
A result cache can store either the document identifiers of query results (docID cache) or the actual HTML result pages (page cache) to speed up query responses. For a fixed memory size, the docID cache achieves a higher hit ratio, while the page cache delivers a faster response. This paper proposes a novel hierarchical result caching scheme based on temporal and spatial locality. First, the scheme splits the result cache into two layers: a page cache and a docID cache. A query is answered from the page cache first and, on a miss, from the docID cache. In terms of average query response time, experiments show that the proposed approach improves substantially on the baseline method, by 9% on average and by up to 11% in the best case. Second, the scheme adds an adaptive prefetching strategy built on the docID cache. Experiments show that combining the scheme with this prefetching strategy yields an additional performance improvement. The result is a complete and effective result caching scheme that captures both the temporal and the spatial locality of user search behaviour.
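The two-layer lookup order described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the LRU replacement policy, the entry-count capacities (the paper fixes memory size, not entry count), the policy of filling both layers on a miss, and the names `LRUCache`, `HierarchicalResultCache`, `search_index`, and `render_page` are all assumptions introduced here. The prefetching strategy is not shown because the abstract does not specify it.

```python
from collections import OrderedDict


class LRUCache:
    """Small LRU cache; LRU is an assumed policy, not taken from the paper."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used


class HierarchicalResultCache:
    """Layer 1: page cache. Layer 2: docID cache. Backend on a full miss."""

    def __init__(self, page_capacity, docid_capacity, search_index, render_page):
        self.pages = LRUCache(page_capacity)
        self.docids = LRUCache(docid_capacity)
        self.search_index = search_index  # hypothetical: query -> list of docIDs
        self.render_page = render_page    # hypothetical: (query, docIDs) -> page

    def answer(self, query):
        # Layer 1: a page hit returns the finished page directly.
        page = self.pages.get(query)
        if page is not None:
            return page, "page_hit"

        # Layer 2: a docID hit skips index lookup but still renders the page.
        ids = self.docids.get(query)
        if ids is not None:
            level = "docid_hit"
        else:
            ids = self.search_index(query)  # full miss: query the backend
            self.docids.put(query, ids)
            level = "miss"

        page = self.render_page(query, ids)
        self.pages.put(query, page)  # assumed admission policy
        return page, level
```

Because docID entries are much smaller than rendered pages, a real split would give the docID layer many more entries for the same memory; how to divide the memory between the layers is the tuning question the abstract refers to.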
Source
《中文信息学报》
CSCD
PKU Core Journal (北大核心)
2016, No. 1, pp. 63-70, 78 (9 pages)
Journal of Chinese Information Processing
Funding
National Basic Research Program of China (973 Program) (2014CB340401, 2012CB316303)
National High Technology Research and Development Program of China (863 Program) (2014AA015204)
National Natural Science Foundation of China (61472401, 61433014, 61425016, 61203298, 61572473)