
Automatic image annotation based on generative adversarial network

Cited by: 8
Abstract: In deep learning-based image annotation models, the number of output-layer neurons is proportional to the size of the annotation vocabulary, so the model structure has to change whenever the vocabulary changes. To address this problem, a new annotation model combining a Generative Adversarial Network (GAN) and Word2vec is proposed. First, Word2vec maps the annotation vocabulary to fixed-dimensional word vectors. Second, a neural network model called GAN-W (GAN-Word2vec annotation) is built on a GAN, so that the number of output-layer neurons equals the dimension of the word vectors and no longer depends on the vocabulary size. Finally, the annotation is determined by ranking the results of multiple model outputs. Experiments were conducted on the Corel 5K and IAPRTC-12 image annotation datasets. On Corel 5K, the accuracy, recall, and F1 value of GAN-W are 5, 14, and 9 percentage points higher, respectively, than those of Convolutional Neural Network Regression (CNN-R); on IAPRTC-12, they are 2, 6, and 3 percentage points higher, respectively, than those of Two-Pass K-Nearest Neighbor (2PKNN). The experimental results show that GAN-W can solve the problem of the output-layer neuron count changing with the vocabulary size. Meanwhile, the number of labels assigned to each image is adaptive, which makes the model's annotations better match real annotation practice.
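The final step in the abstract, ranking several model outputs to decide the annotation, can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so everything below is an assumption: the toy 3-dimensional vectors stand in for real Word2vec embeddings, the `outputs` list stands in for repeated generator samples, and the names `nearest_label`, `annotate`, and `min_share` are hypothetical. Each output vector is mapped to its nearest vocabulary word by cosine similarity, the words are ranked by vote count, and words below a vote-share threshold are dropped, so the number of labels per image is not fixed in advance.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_label(vec, embeddings):
    """Return the vocabulary word whose embedding is most similar to vec."""
    return max(embeddings, key=lambda word: cosine(vec, embeddings[word]))


def annotate(outputs, embeddings, min_share=0.2):
    """Rank words by how often they are the nearest neighbour of an output
    vector, and keep those whose vote share is at least min_share.
    (min_share is an assumed selection rule, not taken from the paper.)"""
    votes = Counter(nearest_label(vec, embeddings) for vec in outputs)
    total = sum(votes.values())
    return [word for word, count in votes.most_common() if count / total >= min_share]


# Toy word vectors (hypothetical; real Word2vec vectors have many more dimensions).
embeddings = {
    "sky": [1.0, 0.0, 0.0],
    "sea": [0.0, 1.0, 0.0],
    "tree": [0.0, 0.0, 1.0],
}

# Stand-in for several generator outputs for one image (assumed values).
outputs = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.0],
    [0.7, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.2, 0.9],
]

labels = annotate(outputs, embeddings)
print(labels)
```

Adding a word to `embeddings` changes only the lookup table, not the length of the output vectors, which is the property the abstract attributes to GAN-W: the output size is tied to the word-vector dimension rather than to the vocabulary size.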
Authors: SHUI Liucheng (税留成); LIU Weizhong (刘卫忠); FENG Zhuoming (冯卓明) (School of Optical and Electronic Information, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2019, No. 7, pp. 2129-2133 (5 pages)
Keywords: automatic image annotation; deep learning; Generative Adversarial Network (GAN); label vectorization; transfer learning
