Abstract
To meet the requirements of developing a thematic search engine, this paper describes how to embed Lucene, an open-source full-text retrieval toolkit, into a custom search engine. To address the shortcomings of Lucene's Chinese word segmentation, the paper designs a more complete Chinese word segmenter, applies it in a concrete application, and compares the result with a traditional search engine in terms of performance.
Source
Microcomputer & Its Applications (《微型机与应用》)
2009, No. 19, pp. 1-3, 6 (4 pages in total)
Keywords
Lucene
full-text retrieval technology
thematic search engine
index
Chinese word segmentation