Abstract
The key problems in full-text retrieval are the index model and the algorithms for building and searching the index. Based on the Inter-Relevant Successive Trees (IRST) model, this paper proposes a practical, compact index model named Streamline Inter-Relevant Successive Trees (SIRST), in which successor nodes are kept sorted and node information is streamlined, and presents the index building and retrieval algorithms for this model. A performance study compares SIRST with the widely used Inverted Files (IF) model on the cost of index building and retrieval under various text sets and query strings. The results show that SIRST's retrieval efficiency is far higher than that of IF, and that SIRST's advantage in index building efficiency becomes more pronounced as the text collection grows.
Source
《小型微型计算机系统》
CSCD
PKU Core Journal (北大核心)
2011, Issue 2, pp. 286-290 (5 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National High Technology Research and Development Program of China (863 Program), grant 2007AA01Z403