Abstract
Weakly supervised action localization detects the categories and temporal boundaries of action instances using only video-level labels. Because frame-level classification labels are unavailable, action frames with inconspicuous features are difficult to identify, and action frames are easily confused with context frames. To address these two problems, a weakly supervised action localization method based on attention-mechanism context modeling is proposed. The method adds semi-soft attention on top of action-background attention to guide the model toward video frames whose action features are inconspicuous, and it models video context information through context attention so that the model can distinguish action frames from context frames. Experimental results show that, at an intersection over union (IoU) threshold of 0.5, the proposed method reaches a mean average precision (mAP) of 32.6% on THUMOS14 and 38.6% on ActivityNet1.3, both public datasets, outperforming existing weakly supervised action localization models.
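The IoU threshold of 0.5 quoted in the abstract refers to temporal overlap between a predicted action segment and a ground-truth segment. The following is a minimal sketch of that overlap computation, not the authors' evaluation code; the function names `temporal_iou` and `is_match`, and the (start, end) representation in seconds, are assumptions made for illustration. A full mAP evaluation would additionally require matching class labels, ranking predictions by confidence, and matching each ground-truth segment at most once.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two segments, each given as (start, end) in seconds."""
    # Length of the overlap; zero when the segments are disjoint.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union length = sum of both lengths minus the overlap.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.5):
    """True when the prediction overlaps the ground truth at or above the threshold."""
    return temporal_iou(pred, gt) >= threshold


if __name__ == "__main__":
    # Hypothetical segments, chosen only to illustrate the threshold.
    print(is_match((2.0, 10.0), (4.0, 10.0)))  # overlap 6 s, union 8 s
    print(is_match((0.0, 10.0), (5.0, 15.0)))  # overlap 5 s, union 15 s
```

With a threshold of 0.5, the first hypothetical prediction counts as a correct localization and the second does not, even though both overlap the ground truth.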
Authors
党伟超
王飞
高改梅
刘春霞
DANG Weichao; WANG Fei; GAO Gaimei; LIU Chunxia (College of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)
Source
《软件导刊》
2023, Issue 12, pp. 78-83 (6 pages)
Software Guide
Funding
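Taiyuan University of Science and Technology Doctoral Research Start-up Fund (20202063)
Taiyuan University of Science and Technology Graduate Education Innovation Project (SY2022063)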
Keywords
weakly supervised
action localization
attention mechanism
semi-soft attention
context modeling