
Research on Text Classification Based on ERNIE-SA-DPCNN: Taking the Text of New Network-Related Crime Cases as an Example
Abstract: In recent years, natural language processing has developed rapidly, and text classification, one of its basic tasks, has seen major breakthroughs, but these advances have not yet reached practical public security work. Current text classification work relies mainly on statistical and probabilistic models, whose classification performance falls short of that of pre-trained models trained on large corpora. This paper uses pre-trained ERNIE as the feature extraction model, combines SA-Net with the attention mechanism in ERNIE, and uses DPCNN as the deep learning network, forming the ERNIE-SA-DPCNN algorithm. Experiments show that ERNIE-SA-DPCNN outperforms other models on the task of classifying case-description texts from new network-related crime cases.
Authors: QIU Kaikai (裘凯凯); DING Weijie (丁伟杰); ZHONG Nanjiang (钟南江) (Department of Computer and Information Security, Zhejiang Police College, Hangzhou 310053, China; Research Institute of Big Data and Network Security, Zhejiang Police College, Hangzhou 310053, China; Key Laboratory of the Ministry of Public Security for Public Security Informatization Application Based on Big Data Architecture, Hangzhou 310053, China)
Source: Modern Information Technology (《现代信息科技》), 2022, No. 6, pp. 69-74 (6 pages)
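Funding: National Undergraduate Innovation and Entrepreneurship Training Program (202011483011); Zhejiang Provincial Public Welfare Technology Research Program (LGF19G010001); Ministry of Public Security Basic Project for Strengthening Policing through Science and Technology (2020GABJC35).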
Keywords: new network-related crime; text classification; ERNIE; SA-Net; DPCNN
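
The abstract describes a three-stage pipeline: ERNIE features, SA-Net attention, and a DPCNN classifier. As a rough illustration of the last stage only, the sketch below traces how a DPCNN-style pyramid shortens the token sequence at each downsampling step. It assumes the downsampling used in the general DPCNN design (max pooling with window 3 and stride 2, with one padded position). This record does not give the authors' configuration, so the function names `pooled_length` and `pyramid_lengths`, the padding choice, and the 128-token example length are all hypothetical.

```python
from typing import List


def pooled_length(length: int, window: int = 3, stride: int = 2, pad: int = 1) -> int:
    """Output length of 1-D max pooling when `pad` positions in total are added.

    Standard pooling arithmetic: floor((L + pad - window) / stride) + 1.
    The defaults are assumptions, not values taken from the paper.
    """
    return (length + pad - window) // stride + 1


def pyramid_lengths(seq_len: int) -> List[int]:
    """Sequence length after each downsampling step, until one position remains."""
    if seq_len < 1:
        raise ValueError("seq_len must be at least 1")
    lengths = [seq_len]
    while lengths[-1] > 1:
        lengths.append(pooled_length(lengths[-1]))
    return lengths


if __name__ == "__main__":
    # Hypothetical input: 128 token positions coming out of the encoder.
    print(pyramid_lengths(128))
```

Under these assumptions each step halves the sequence, so a 128-token input passes through lengths 128, 64, 32, 16, 8, 4, 2, 1, and the number of downsampling steps grows only logarithmically with the input length.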
