Abstract
In recent years, natural language processing has developed rapidly, and text classification, one of its basic tasks, has seen major breakthroughs; these advances, however, have not yet carried over into practical public security work. At present, text classification mainly relies on statistical and probabilistic models, but their classification performance is not ideal compared with pre-trained models trained on large corpora. This paper uses pre-trained ERNIE as the feature extraction model, combines SA-Net with the attention mechanism in the ERNIE model, and finally uses DPCNN as the deep learning network, forming the ERNIE-SA-DPCNN algorithm. Experiments show that ERNIE-SA-DPCNN outperforms other models on the task of classifying case-description texts for new types of internet-related crime.
Authors
QIU Kaikai; DING Weijie; ZHONG Nanjiang
(Department of Computer and Information Security, Zhejiang Police College, Hangzhou 310053, China; Research Institute of Big Data and Network Security, Zhejiang Police College, Hangzhou 310053, China; Key Laboratory of the Ministry of Public Security for Public Security Informatization Application Based on Big Data Architecture, Hangzhou 310053, China)
Source
Modern Information Technology (《现代信息科技》)
2022, No. 6, pp. 69-74 (6 pages)
Funding
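National Undergraduate Innovation and Entrepreneurship Training Program (202011483011)
Zhejiang Provincial Public Welfare Technology Research Program (LGF19G010001)
Ministry of Public Security Basic Program for Strengthening Police through Science and Technology (2020GABJC35)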