Abstract
Text intent detection often suffers from insufficient training data, and because text is discrete, it is difficult to augment the data while keeping labels unchanged and still improve on the original model's performance. To address these problems in few-shot intent detection, this study proposes a method that combines stepwise data augmentation with a phased training strategy. The method augments the original data progressively from a global and a local perspective: first over all utterances, then over sample pairs within the same category. During training, learning is divided into stages that correspond to these progressive augmentation levels. Experiments on multiple intent detection datasets evaluate the effectiveness of the method. The results show that it improves both the accuracy and the stability of intent detection models in few-shot settings.
Authors
LI Yu-Ru (李玉茹)
ZHANG Xiao-Bin (张晓滨)
School of Computer Science, Xi’an Polytechnic University, Xi’an 710048, China
Source
Computer Systems & Applications (《计算机系统应用》)
2023, Issue 1, pp. 406-412 (7 pages)
Funding
Natural Science Foundation of Shaanxi Province (2019JQ-849).
Keywords
few-shot
intent detection
data augmentation
stepwise
phased training