Abstract
For full-text retrieval systems that build their inverted index on single Chinese characters, the most pressing problem is how to improve storage efficiency and processing speed. This paper proposes three optimizations to address it: first, compressing the inverted file with parameterized Golomb coding; second, improving the logical-AND algorithm that computes set intersections by using a bidirectional binary search; third, applying parallel computing and double buffering. Experimental results show that the optimized single-character full-text retrieval system has lower storage overhead and higher performance, and has reached a practically usable level.
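As a rough illustration of the first optimization, the following minimal sketch applies Golomb coding to the gaps between sorted document numbers in one posting list. It is not the paper's implementation: the function names, the bit-string representation, and the choice of passing the parameter `b` explicitly are all assumptions made for this example, and the abstract does not say how the paper selects its parameter.

```python
def _bits(value, width):
    """Return value as a fixed-width binary string ('' when width is 0)."""
    return format(value, "b").zfill(width) if width > 0 else ""


def golomb_encode(n, b):
    """Encode a positive integer n with Golomb parameter b as a bit string."""
    if n < 1 or b < 1:
        raise ValueError("n and b must be positive integers")
    q, r = divmod(n - 1, b)          # quotient -> unary, remainder -> truncated binary
    k = (b - 1).bit_length()         # ceil(log2(b))
    cutoff = (1 << k) - b            # remainders below cutoff use k-1 bits
    tail = _bits(r, k - 1) if r < cutoff else _bits(r + cutoff, k)
    return "1" * q + "0" + tail


def golomb_decode(bits, b):
    """Decode a concatenation of Golomb codewords back into integers."""
    k = (b - 1).bit_length()
    cutoff = (1 << k) - b
    out, i = [], 0
    while i < len(bits):
        q = 0
        while bits[i] == "1":        # unary part: count the leading ones
            q += 1
            i += 1
        i += 1                       # skip the terminating zero
        r = 0
        if k > 0:
            if k > 1:
                r = int(bits[i:i + k - 1], 2)
            i += k - 1
            if r >= cutoff:          # long codeword: one more bit follows
                r = ((r << 1) | int(bits[i])) - cutoff
                i += 1
        out.append(q * b + r + 1)
    return out


def compress_postings(doc_ids, b):
    """Golomb-code the gaps of a strictly increasing list of document numbers."""
    gaps = [cur - prev for prev, cur in zip([0] + doc_ids[:-1], doc_ids)]
    return "".join(golomb_encode(g, b) for g in gaps)


def decompress_postings(bits, b):
    """Rebuild the document numbers by summing the decoded gaps."""
    doc_ids, total = [], 0
    for gap in golomb_decode(bits, b):
        total += gap
        doc_ids.append(total)
    return doc_ids
```

For example, `decompress_postings(compress_postings([3, 7, 8, 20, 21, 40], 4), 4)` recovers the original list. A real system would pack the bits into bytes rather than use a Python string; the string form is kept here only for readability.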
Source
《中文信息学报》
CSCD
Peking University Core Journals
2001, No. 4, pp. 14-19, 27 (7 pages in total)
Journal of Chinese Information Processing
Funding
Project supported by the 863 High-Tech Program (863-306-ZD-07-02)