WMA: A Multi-Scale Self-Attention Feature Extraction Network Based on Weight Sharing for VQA (Cited by: 1)

Abstract: Visual Question Answering (VQA) has attracted extensive research attention and has recently become a hot topic in deep learning. Advances in computer vision and natural language processing have contributed to progress in this research area. The main avenues for improving the performance of a VQA system lie in its feature extraction, multimodal fusion, and answer prediction modules. An unsolved issue in popular VQA image feature extraction modules is that they have difficulty extracting fine-grained features from objects of different scales. In this paper, a novel feature extraction network that combines multi-scale convolution and self-attention branches is designed to solve this problem. Our approach achieves state-of-the-art single-model performance on the Pascal VOC 2012, VQA 1.0, and VQA 2.0 datasets.
Source: Journal on Big Data, 2021, Issue 3, pp. 111-118 (8 pages in total)
Funding: This work is supported by the National Natural Science Foundation of China (61872231, 61701297).