
Design and Implementation of Full-Text Retrieval System for Xi'an Data Chorography (cited by: 1)
Abstract: A first-pass full-text search over PDF documents is implemented with the Lucene API. To locate search keywords more precisely, a new secondary-index algorithm is designed and implemented; the secondary index records each keyword's page number, coordinates, and surrounding context. Using this index, a search result can be resolved to a specific page of the PDF document and the keyword's exact position marked on that page, so that secondary retrieval over PDF documents achieves a book-search effect similar to Google Book. System tests indicate good retrieval performance, with high recall and precision, sufficient to meet users' need for fast retrieval. The system has been in use for two years as the full-text retrieval platform for Xi'an's digital chorography and has achieved good results in application.
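The abstract describes the secondary index only by what it stores (page number, coordinates, context). As a rough illustration of that idea, and not the authors' algorithm, the following sketch builds such an index in plain Python. The names `Hit` and `build_secondary_index`, the `(char, x, y)` glyph format, and the `window` parameter are all assumptions made for this example; a real system would obtain glyph positions from a PDF text extractor.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One keyword occurrence (hypothetical record layout)."""
    page: int      # 1-based page number
    x: float       # x coordinate of the keyword's first character
    y: float       # y coordinate of the keyword's first character
    context: str   # text surrounding the keyword, for result snippets


def build_secondary_index(pages, keywords, window=10):
    """Map each keyword to its occurrences.

    pages: list of pages; each page is a list of (char, x, y) tuples,
           one tuple per single character (assumed input format).
    window: number of context characters kept on each side.
    """
    index = defaultdict(list)
    for page_no, glyphs in enumerate(pages, start=1):
        # One character per glyph, so string offsets match glyph offsets.
        text = "".join(ch for ch, _, _ in glyphs)
        for kw in keywords:
            start = text.find(kw)
            while start != -1:
                _, x, y = glyphs[start]
                lo = max(0, start - window)
                hi = min(len(text), start + len(kw) + window)
                index[kw].append(Hit(page_no, x, y, text[lo:hi]))
                start = text.find(kw, start + 1)
    return index
```

Given a keyword from a first-pass search, looking it up in the returned mapping yields the pages to open and the coordinates at which a viewer could draw a highlight.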
Source: Computer Technology and Development (《计算机技术与发展》), 2011, No. 10, pp. 121-124 (4 pages)
Funding: Ministry of Education Characteristic Specialty Construction Program (TS11772)
Keywords: full-text retrieval; secondary index; secondary retrieval; recall; precision