二元互关联后继树精简索引模型研究 (Research on a Streamlined Index Model Based on Binary Inter-Relevant Successive Trees). Cited by: 2

Research on Streamline Inter-relevant Successive Trees


Abstract: The key problems in full-text retrieval are the index model and the algorithms for building and searching the index. Building on the binary Inter-Relevant Successive Trees (IRST) model, this paper proposes a practical streamlined index model with sorted successor nodes, Streamline Inter-Relevant Successive Trees (SIRST), and presents the index building and retrieval algorithms for this model. A comparison with the widely used Inverted Files (IF) model under various text sets and query strings shows that the retrieval efficiency of SIRST is far higher than that of IF, and that the advantage of SIRST in index-building efficiency becomes more pronounced as the text collection grows.
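The abstract does not describe the data structures themselves, so the following is only a toy illustration of the two kinds of index being compared, not the paper's SIRST algorithm. It sketches a character-level positional inverted file (the IF baseline) next to a simplified successor index in which each character keeps a sorted list of (successor, position) pairs, so that a two-character pair can be located by binary search. The layout, the `END` sentinel, and all function names are assumptions made for this example.

```python
from bisect import bisect_left
from collections import defaultdict

END = "\0"  # hypothetical sentinel: successor of the last character


def build_inverted_file(text):
    """Inverted file: map each character to its sorted occurrence positions."""
    index = defaultdict(list)
    for pos, ch in enumerate(text):
        index[ch].append(pos)  # appended in increasing order
    return index


def search_inverted_file(index, query):
    """Find start positions of query by intersecting shifted posting lists."""
    if not query:
        return []
    candidates = set(index.get(query[0], []))
    for offset, ch in enumerate(query[1:], start=1):
        candidates &= {p - offset for p in index.get(ch, [])}
    return sorted(candidates)


def build_successor_index(text):
    """Toy successor index (an assumed layout, not the paper's):
    each character maps to a sorted list of (successor, position) pairs."""
    index = defaultdict(list)
    padded = text + END
    for pos in range(len(text)):
        index[padded[pos]].append((padded[pos + 1], pos))
    for pairs in index.values():
        pairs.sort()  # sorted successors allow binary search
    return index


def pair_positions(index, first, second):
    """Positions where `first` is immediately followed by `second`."""
    pairs = index.get(first, [])
    lo = bisect_left(pairs, (second, -1))
    hi = bisect_left(pairs, (second, float("inf")))
    return [pos for _, pos in pairs[lo:hi]]


def search_successor_index(index, query):
    """Find start positions of query by chaining adjacent character pairs."""
    if not query:
        return []
    if len(query) == 1:
        return sorted(pos for _, pos in index.get(query, []))
    candidates = set(pair_positions(index, query[0], query[1]))
    for offset in range(1, len(query) - 1):
        nxt = pair_positions(index, query[offset], query[offset + 1])
        candidates &= {p - offset for p in nxt}
    return sorted(candidates)


text = "abracadabra"
print(search_inverted_file(build_inverted_file(text), "abra"))      # → [0, 7]
print(search_successor_index(build_successor_index(text), "abra"))  # → [0, 7]
```

In this sketch the successor index starts from the occurrences of a character pair rather than of a single character, which usually gives a smaller initial candidate set. Whether this resembles the source of the speedup reported in the paper is not stated in the abstract.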

Authors: Huo Lin (霍林), Huang Junwen (黄俊文), Lu Zhengding (卢正鼎), Huang Baohua (黄保华), Pan Yinghua (潘英花), Wang Li (王力)

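Affiliations: School of Computer Science, Huazhong University of Science and Technology; School of Computer and Electronic Information, Guangxi University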

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 2, pp. 286-290 (5 pages)

Funding: Supported by the National High Technology Research and Development Program of China ("863" Program), grant 2007AA01Z403

Keywords: binary inter-relevant successive trees; sorted successive nodes; Inter-Relevant Successive Trees (IRST); streamlined index model; SIRST

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]



