Abstract
Predicting query performance (PQP) is now regarded as one of the most important capabilities of an information retrieval (IR) system. Research and experiments in recent years indicate that PQP is an effective way to address the robustness problem of IR systems, that it can give useful feedback to users, search engines, and database creators, and that it has considerable room for further development in text retrieval. This paper surveys PQP for text retrieval, focusing on the main methods and key techniques. It first introduces the commonly used experimental corpora and evaluation methods, then presents the factors that contribute to the variability of retrieval performance across queries. It next outlines the main PQP approaches, classified as pre-retrieval or post-retrieval, and briefly describes several applications of PQP. Finally, it discusses the main challenges and open issues facing PQP.
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core (北大核心)
2008, Issue 2, pp. 291-300 (10 pages)
Funding
Supported by the National Natural Science Foundation of China under Grant No. 60603094;
the National Basic Research Program of China (973 Program) under Grant No. 2004CB318109;
and the Beijing Science and Technology Planning Program of China under Grant No. D0106008040291.
Keywords
information retrieval (信息检索)
query performance prediction (查询性能预测)