Abstract
To address the low recognition accuracy in segmenting and recognizing continuous action sequences with imbalanced sample distributions, a novel deep learning-based model for continuous action segmentation and recognition is proposed. The model extracts richer, more comprehensive action features from multi-dimensional time series: a feature extraction unit based on bidirectional long short-term memory networks (Bi-LSTM) extracts features from the data of each modality, a feature fusion module based on an attention mechanism fuses the multimodal features, and a decoder built from fully connected layers performs the final classification. In the experiments, the model was validated on multimodal continuous-action data collected by multiple sensors during the continuous circular capsulorhexis step of ophthalmic surgery. The results show that the proposed model outperforms data-level fusion algorithms based on long short-term memory (LSTM) and gated recurrent unit (GRU) networks, as well as four feature-level fusion strategies. For the action class with the smallest amount of data, recognition accuracy improves by more than 14%, the overall F1 score improves by more than 8%, and the overall recognition accuracy reaches 90.72%. These results indicate that the model effectively addresses the accuracy problem in segmenting and recognizing continuous action sequences with imbalanced sample distributions, and offers a new approach to multimodal continuous action segmentation under sample imbalance.
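The paper's abstract describes the fusion stage only at a high level. As an illustration, the following is a minimal NumPy sketch of attention-based fusion of per-modality feature vectors (such as Bi-LSTM outputs), under the assumption that the attention weights are a softmax over learned modality scores; all names, dimensions, and the scoring vector `w` are hypothetical stand-ins, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical per-modality feature vectors of dimension d,
# standing in for the Bi-LSTM feature extraction units' outputs.
d = 8
modal_feats = [rng.standard_normal(d) for _ in range(3)]  # 3 modalities

# Attention-based fusion: score each modality against a (here random,
# in practice learned) vector w, normalize scores with softmax, and
# take the weighted sum of the modality features.
w = rng.standard_normal(d)                       # stand-in for learned params
scores = np.array([f @ w for f in modal_feats])  # one scalar per modality
alpha = softmax(scores)                          # attention weights, sum to 1
fused = sum(a * f for a, f in zip(alpha, modal_feats))  # fused feature, shape (d,)
```

The fused vector `fused` would then be passed to the fully-connected decoder for classification; in a trained model, `w` (or a small scoring network) is learned jointly with the rest of the parameters.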
Authors
ZHENG Jia-ying; WANG Jie; FU Pan; LI Zhen; BIAN Gui-bin
(School of Automation, Beijing Information Science and Technology University, Beijing 100096, China; Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China)
Source
Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal
2023, Issue 29, pp. 12620-12627 (8 pages)
Funding
National Natural Science Foundation of China (U20A20196).
Keywords
data fusion
imbalanced dataset
action recognition
variable-length time series segmentation