Abstract
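Wikipedia provides a vast number of high-quality articles describing well-known concepts, and rich images make those articles more valuable. However, most Wikipedia articles have few or no images. To address this, the paper presents WIMAGE, a comprehensive framework for discovering images for Wikipedia articles with high precision, high recall, and high diversity. WIMAGE comprises a query-generation method and two image-ranking methods. Experiments on 40 articles from 4 common Wikipedia categories show that WIMAGE effectively discovers high-precision, high-recall, and high-diversity images for Wikipedia articles, and that the ranking method that considers both visual similarity and text similarity performs best.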
Wikipedia provides a large number of human-edited articles on popular concepts in most domains. A Wikipedia article with a diverse set of images is more valuable than one with no images. This paper formulates the problem of discovering images for Wikipedia articles with high precision, high recall, and high diversity, and proposes a general framework, WIMAGE, to address it. WIMAGE includes an approach for generating queries for the different paragraphs of each Wikipedia article, and two progressively refined methods for ranking the retrieved images. The effectiveness of WIMAGE is evaluated on 40 Wikipedia articles from 4 popular Wikipedia categories. Experimental results show that WIMAGE discovers images for Wikipedia articles with high precision, high recall, and high diversity, and that the ranking method that takes both visual similarity and text similarity into account performs better.
Source
Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》)
CSCD
2011, No. 7, pp. 577-587 (11 pages)
Funding
National Natural Science Foundation of China, Nos. 61050009 and 60933004
HP Labs Innovation Research Program, No. 2009-1002-2-A
Keywords
Wikipedia article
image discovery
diversity
image ranking