Abstract
To address two problems in ML-GCN, namely that the excessively high dimensionality of the label co-occurrence embedding degrades classification performance and that the asymmetric relationships between labels are not fully exploited, a multi-label image classification model based on a graph attention network, ML-GAT, is proposed. ML-GAT first reduces the dimensionality of the high-dimensional label semantic embedding matrix. It then obtains the label co-occurrence embedding from the reduced, low-dimensional label semantic embeddings and the label category co-occurrence graph. In parallel, ML-GAT feeds the original multi-label image into a convolutional neural network to extract general image features, and brings these features to the same dimension as the label co-occurrence embedding computed by the graph attention network. Finally, ML-GAT fuses the label co-occurrence embedding with the general image features to obtain the label prediction scores for each multi-label image. Experimental results on VOC 2007 and MS-COCO 2014 show that ML-GAT achieves good results when training samples are sufficient and the number of label categories is large enough. Comparison with other models indicates that the strategies adopted by ML-GAT can improve multi-label image classification performance to a certain extent.
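The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not state the reduction method, the CNN backbone, or the fusion operator, so the sketch assumes a linear projection for dimensionality reduction, a single-head graph attention layer, random stand-ins for the CNN features, and a dot product followed by a sigmoid for fusion. All function names, sizes, and weights are hypothetical and untrained.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(h, adj, w, a_src, a_dst):
    """Single-head graph attention layer over label nodes.

    h: label features (C x d_in). adj[i][j] is truthy when label i attends
    to label j; the graph is directed, so adj need not be symmetric.
    """
    z = matmul(h, w)  # projected label features, C x d_out
    src = [sum(v * a for v, a in zip(row, a_src)) for row in z]
    dst = [sum(v * a for v, a in zip(row, a_dst)) for row in z]
    out = []
    for i in range(len(z)):
        # Each label attends to its out-neighbours and to itself.
        nbrs = [j for j in range(len(z)) if adj[i][j] or i == j]
        scores = [leaky_relu(src[i] + dst[j]) for j in nbrs]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        alpha = [e / total for e in exps]
        out.append([sum(a * z[j][k] for a, j in zip(alpha, nbrs))
                    for k in range(len(z[0]))])
    return out


def predict_scores(image_feat, label_emb, w_img):
    """Assumed fusion: project the image feature to the label-embedding
    dimension, then score each label with a dot product and a sigmoid."""
    x = matmul([image_feat], w_img)[0]
    logits = [sum(a * b for a, b in zip(x, row)) for row in label_emb]
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]


rng = random.Random(0)
num_labels, d_word, d_low, d_out, d_cnn = 4, 12, 6, 5, 8

# Step 1: reduce the high-dimensional label semantic embeddings (assumed linear).
word_emb = random_matrix(num_labels, d_word, rng)
low_emb = matmul(word_emb, random_matrix(d_word, d_low, rng))

# Step 2: label co-occurrence embedding from a directed (asymmetric) label graph.
adj = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [1, 0, 0, 1],
       [0, 0, 0, 0]]
label_emb = gat_layer(low_emb, adj, random_matrix(d_low, d_out, rng),
                      [rng.gauss(0.0, 0.1) for _ in range(d_out)],
                      [rng.gauss(0.0, 0.1) for _ in range(d_out)])

# Steps 3 and 4: stand-in CNN feature, dimension matching, and fusion.
image_feat = [rng.random() for _ in range(d_cnn)]
scores = predict_scores(image_feat, label_emb, random_matrix(d_cnn, d_out, rng))
print(scores)  # one score in (0, 1) per label
```

Because attention weights are computed per directed edge, label i may weight label j differently from how j weights i, which is one way the asymmetry mentioned in the abstract could be represented.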
Authors
张辉宜
张进
黄俊
ZHANG Hui-yi; ZHANG Jin; HUANG Jun (School of Computer Science and Technology, Anhui University of Technology, Ma'anshan, Anhui 243000, China)
Source
《重庆工商大学学报(自然科学版)》
2022, No. 1, pp. 34-41 (8 pages)
Journal of Chongqing Technology and Business University: Natural Science Edition
Funding
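National Natural Science Foundation of China (61806005)
Key Natural Science Research Project of Anhui Provincial Universities (KJ2018A0050)
Key Teaching Research Project of the Anhui Provincial Department of Education (2018JYXM1050)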
Keywords
multi-label classification
graph attention network
convolutional neural network
deep learning