Abstract
With the growing volume of Tibetan-language information on the Internet, effective data collection and full-text search for Tibetan require further study. This paper briefly analyzes the problems in current Tibetan data collection and retrieval. It proposes a Tibetan information collection scheme based on meta-search technology. Building on the full-text search toolkit Lucene and taking the characteristics of Tibetan into account, it also proposes a design for indexing and searching Tibetan information, and discusses the key technologies of that design. Use in a working system shows that the approach is feasible. The data collection, indexing, and retrieval schemes described also apply to languages other than Tibetan.
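The abstract does not say how the indexing design accounts for the characteristics of Tibetan. One well-known property is that Tibetan syllables are delimited by the tsheg mark (U+0F0B) rather than by spaces. The following is a minimal sketch, assuming syllable-level tokenization and a plain in-memory inverted index; it is written in Python rather than against Lucene, and the function names `syllables`, `build_index`, and `search` are hypothetical, not taken from the paper.

```python
import re
from collections import defaultdict

TSHEG = "\u0f0b"  # Tibetan syllable delimiter

# Assumption: split on tsheg (U+0F0B), non-breaking tsheg (U+0F0C),
# shad (U+0F0D), double shad (U+0F0E), and whitespace.
_DELIMS = re.compile("[\u0f0b\u0f0c\u0f0d\u0f0e\\s]+")


def syllables(text):
    """Return the non-empty syllables of a Tibetan string."""
    return [s for s in _DELIMS.split(text) if s]


def build_index(docs):
    """Map each syllable to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for syl in syllables(text):
            index[syl].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every syllable of the query."""
    terms = syllables(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

In a Lucene-based system, comparable splitting logic would typically sit in a custom `Analyzer` and `Tokenizer`; whether the paper takes that route is not stated in the abstract.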
Source
Computer Development & Applications (《电脑开发与应用》)
2011, No. 2, pp. 34-37 (4 pages)
Keywords
information collection
Lucene
Tibetan full-text information retrieval