Discovering Images for Wikipedia Articles
Abstract: Wikipedia provides a large number of high-quality, human-edited articles on popular concepts in most domains, and an article illustrated with diverse images is more valuable than one with none. Most Wikipedia articles, however, have few or no images. This paper formulates the problem of discovering images for Wikipedia articles with high precision, high recall, and high diversity, and proposes a general framework, WIMAGE, to address it. WIMAGE comprises an approach for generating queries for the different paragraphs of each Wikipedia article and two methods for ranking the retrieved images. The framework is evaluated on 40 Wikipedia articles drawn from 4 popular Wikipedia categories. The experimental results show that WIMAGE discovers images with high precision, high recall, and high diversity, and that the ranking method that takes both visual similarity and text similarity into account performs best.
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2011, No. 7, pp. 577-587 (11 pages)
Funding: National Natural Science Foundation of China (Nos. 61050009, 60933004); HP Labs Innovation Research Program (No. 2009-1002-2-A)
Keywords: Wikipedia article; image discovery; diversity; image ranking
