Abstract
For large-scale monolingual corpus resources, this paper proposes a two-level index mechanism built on a B-tree structure. It studies strategies for organizing index keywords and retrieval keywords, and takes the word frequency of the retrieval keywords into account, which effectively addresses the retrieval efficiency problem. In addition, keyword grouping and phrase recognition strategies substantially improve retrieval accuracy.
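The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a first level that maps each word to its postings and a second level that maps sentence ids to word positions. A sorted word list searched with `bisect` stands in for the B-tree, and "use word frequency" is interpreted as intersecting the rarest keywords first. All names (`TwoLevelIndex`, `search_all`, `search_phrase`) are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


class TwoLevelIndex:
    """Toy two-level index (illustrative assumption, not the paper's design)."""

    def __init__(self, sentences):
        postings = defaultdict(lambda: defaultdict(list))
        for sid, sentence in enumerate(sentences):
            for pos, word in enumerate(sentence.split()):
                postings[word][sid].append(pos)
        # Level 1: sorted vocabulary, a stand-in for B-tree keys.
        self.words = sorted(postings)
        # Level 2: for each word, sentence id -> positions in that sentence.
        self.postings = [dict(postings[w]) for w in self.words]

    def lookup(self, word):
        """Binary-search level 1, then return the level-2 postings."""
        i = bisect_left(self.words, word)
        if i < len(self.words) and self.words[i] == word:
            return self.postings[i]
        return {}

    def frequency(self, word):
        """Total number of occurrences of a word in the corpus."""
        return sum(len(p) for p in self.lookup(word).values())

    def search_all(self, keywords):
        """Sentence ids containing every keyword, rarest keyword first."""
        result = None
        for word in sorted(keywords, key=self.frequency):
            ids = set(self.lookup(word))
            result = ids if result is None else result & ids
            if not result:
                return []  # early exit: no sentence can match
        return sorted(result or [])

    def search_phrase(self, phrase):
        """Sentence ids where the words of the phrase occur consecutively."""
        words = phrase.split()
        if not words:
            return []
        hits = []
        for sid in self.search_all(words):
            for start in self.lookup(words[0])[sid]:
                if all(start + k in self.lookup(w)[sid]
                       for k, w in enumerate(words)):
                    hits.append(sid)
                    break
        return hits
```

Processing the least frequent keyword first keeps the intermediate result small, and the position lists allow a phrase to be distinguished from a mere co-occurrence of its words.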
Source
《鞍山科技大学学报》
CAS
2007, No. 1, pp. 40-43 (4 pages)
Journal of Anshan University of Science and Technology
Keywords
corpus
word frequency
two-level index
retrieval