Abstract
A scene graph is a structured description of a natural image that helps improve the performance and accuracy of downstream image understanding tasks. Scene graph research is an important topic in computer vision and deep learning, and scene graph generation is both its focus and its main difficulty. Because the long-tail effect in datasets biases the accuracy of the generated object relationships and severely limits the quality of the generated scene graphs, unbiased scene graph generation has drawn increasing attention. After introducing three concepts (visual relationships, scene graphs, and the long-tail effect), this survey follows the unbiased scene graph generation pipeline and divides existing methods into three categories: data balancing, unbiased training, and relationship reasoning. It then summarizes and analyzes the strengths and characteristics of the common algorithms in each category and compares their performance. Finally, it identifies incorporating external knowledge, distinguishing predicate granularity, improving small-sample recognition, and building more balanced datasets as the key directions for future research on unbiased scene graph generation.
Authors
Kang Kang (康慷)
Yang You (杨有)
Zhang Ruhui (张汝荟)
Zuo Xinyue (左心悦)
Jiang Weiwei (姜维维)
Kang Kang; Yang You; Zhang Ruhui; Zuo Xinyue; Jiang Weiwei (School of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China; National Center for Applied Mathematics in Chongqing, Chongqing 401331, China)
Source
Journal of Yili Normal University: Natural Science Edition (《伊犁师范大学学报(自然科学版)》)
2022, No. 3, pp. 55-66 (12 pages)
Funding
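Chongqing Postgraduate Joint Training Base project "Chongqing Normal University and Chongqing Century Keyi Technology Co., Ltd. (重庆世纪科怡科技股份有限公司) Electronic Information (Computer Technology) Postgraduate Joint Training Base"
Chongqing Normal University Fund Project (Talent Introduction / Doctoral Start-up), grant 21XLB032.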
Keywords
unbiased scene graph
scene graph generation
visual relationship
long-tail effect