Traffic Scene Captioning with Multi-Stage Feature Enhancement

Abstract: Traffic scene captioning automatically generates one or more sentences describing a traffic scene by analyzing the input image, helping to ensure road safety while providing an important decision-making function for sustainable transportation. To describe complex traffic scenes comprehensively and reasonably, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. The model follows an encoder-decoder structure. First, multi-level granularity visual features are used for feature enhancement during encoding, which enables the model to learn more detailed content in the traffic scene image. Second, a scene knowledge graph is applied during decoding: the semantic features it provides enhance the features learned by the decoder a second time, so that the model can learn the attributes of objects in the traffic scene and the relationships between them, and thus generate more reasonable captions. Extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic evaluation metrics, show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, notably achieving a CIDEr-D score of 129.0. This indicates that the proposed model can effectively provide a more reasonable and comprehensive description of traffic scenes.

Authors: Dehai Zhang, Yu Ma, Qing Liu, Haoxing Wang, Anquan Ren, Jiashu Liang

Affiliation: School of Software

Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 9, pp. 2901-2920 (20 pages)

Funding: Funded by (i) the National Natural Science Foundation of China (NSFC) under Grant Nos. 61402397, 61263043, 61562093 and 61663046; (ii) the Open Foundation of Key Laboratory in Software Engineering of Yunnan Province, No. 2020SE304; (iii) the Practical Innovation Project of Yunnan University, Project Nos. 2021z34, 2021y128 and 2021y129.

Keywords: traffic scene captioning; sustainable transportation; feature enhancement; encoder-decoder structure; multi-level granularity; scene knowledge graph

Classification: TN91 [Electronics and Telecommunications: Communication and Information Systems]

