
VFM: Visual Feedback Model for Robust Object Recognition (Cited: 1)

Abstract: Object recognition, which consists of classification and detection, has two important attributes for robustness: 1) closeness: detection windows should be as close to object locations as possible; and 2) adaptiveness: object matching should be adaptive to object variations within an object class. It is difficult to satisfy both attributes with traditional methods that consider classification and detection separately; recent studies therefore propose to combine them based on confidence contextualization and foreground modeling. However, these combinations neglect feature saliency and object structure, and biological evidence suggests that both can be important in guiding recognition from low level to high level. In fact, object recognition originates in the mechanism of the "what" and "where" pathways of the human visual system. More importantly, these pathways feed back to each other and exchange useful information, which may improve closeness and adaptiveness. Inspired by this visual feedback, we propose a robust object recognition framework built on a computational visual feedback model (VFM) between classification and detection. In the "what" feedback, feature saliency from classification is exploited to rectify detection windows for better closeness; in the "where" feedback, object parts from detection are used to match object structure for better adaptiveness. Experimental results show that the "what" and "where" feedback effectively improves closeness and adaptiveness for object recognition, and encouraging improvements are obtained on the challenging PASCAL VOC 2007 dataset.
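The two feedback directions described in the abstract can be sketched schematically: "what" feedback shifts a detection window toward salient features, and "where" feedback re-weights classification features using detected parts. The sketch below is a toy illustration of this idea only; all function names and data structures are hypothetical placeholders, not the authors' implementation.

```python
# Toy illustration of the "what"/"where" feedback idea from the abstract.
# Everything here (names, representations) is hypothetical, not the paper's code.

def what_feedback(window, saliency_map):
    """'What' feedback: rectify a detection window toward salient
    features to improve closeness. window = (x, y, w, h);
    saliency_map maps pixel coordinates (px, py) to saliency values."""
    x, y, w, h = window
    total = max(sum(saliency_map.values()), 1e-9)
    # Saliency center of mass (toy computation).
    sx = sum(px * v for (px, py), v in saliency_map.items()) / total
    sy = sum(py * v for (px, py), v in saliency_map.items()) / total
    # Re-center the window on the saliency center.
    return (int(sx - w / 2), int(sy - h / 2), w, h)

def where_feedback(features, detected_parts):
    """'Where' feedback: use object parts found by the detector to make
    feature matching structure-aware, improving adaptiveness. Features
    that coincide with a detected part are up-weighted."""
    return {f: v * (2.0 if f in detected_parts else 1.0)
            for f, v in features.items()}
```

In a full system the two calls would alternate: the classifier's saliency rectifies the detector's windows, and the detector's parts refine the classifier's matching, so each pathway benefits from the other's output.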
Authors: Wang Chong (王冲), Huang Kaiqi (黄凯奇)
Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2015, Issue 2, pp. 325-339 (15 pages)
Funding: This work was supported by the National Basic Research 973 Program of China under Grant No. 2012CB316302, the National Natural Science Foundation of China under Grant Nos. 61322209 and 61175007, and the National Key Technology Research and Development Program of China under Grant No. 2012BAH07B01. The authors thank Steve Maybank for the revision.
Keywords: object recognition, object classification, object detection, visual feedback

