Text multi-classification deep learning algorithm based on unbalanced data set (cited by: 4)

Abstract: Text multi-classification models trained on unbalanced data sets tend to classify small-sample classes with low accuracy. To address this problem, a hybrid average clustering sampling algorithm for text multi-classification is proposed, based on dynamic K-means clustering guided by the silhouette coefficient. Within the unbalanced data set, small-sample classes are oversampled in equal proportion using their clusters, and large-sample classes are undersampled using their clusters. A text convolutional neural network is designed on a Weibo (microblog) disaster data set, and the algorithm is verified and analyzed experimentally. The experimental results indicate that the algorithm effectively improves the accuracy and F1 score on the unbalanced text data set and handles the unbalanced text classification problem well.
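The sampling scheme described in the abstract can be illustrated with a minimal sketch. The paper's full text is not reproduced here, so the details below are assumptions rather than the authors' implementation: documents are assumed to be already converted to numeric vectors, K is chosen per class as the value in 2..`k_max` with the highest mean silhouette coefficient, every class is resampled to the mean class size, each cluster contributes in proportion to its size, and oversampling simply duplicates random members. All function names (`kmeans`, `silhouette`, `best_clustering`, `quotas`, `resample_class`, `balance`) are hypothetical.

```python
import random
from collections import defaultdict

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; 0.0 if fewer than two clusters are non-empty."""
    clusters = defaultdict(list)
    for p, l in zip(points, labels):
        clusters[l].append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        a = sum(dist2(p, q) ** 0.5 for q in own) / (len(own) - 1)
        b = min(sum(dist2(p, q) ** 0.5 for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)

def best_clustering(points, rng, k_max=5):
    """Assumed 'dynamic K' rule: try k = 2..k_max, keep the best silhouette."""
    best = None
    for k in range(2, min(k_max, len(points) - 1) + 1):
        labels = kmeans(points, k, rng)
        score = silhouette(points, labels)
        if best is None or score > best[0]:
            best = (score, labels)
    return best[1] if best else [0] * len(points)

def quotas(sizes, target):
    """Largest-remainder split of `target` across clusters, proportional to size."""
    total = sum(sizes)
    base = [target * s // total for s in sizes]
    order = sorted(range(len(sizes)), key=lambda i: target * sizes[i] % total, reverse=True)
    for i in order[:target - sum(base)]:
        base[i] += 1
    return base

def resample_class(points, target, rng):
    """Resample one class to `target` items, cluster by cluster."""
    clusters = defaultdict(list)
    for p, l in zip(points, best_clustering(points, rng)):
        clusters[l].append(p)
    groups = list(clusters.values())
    out = []
    for members, quota in zip(groups, quotas([len(g) for g in groups], target)):
        if quota <= len(members):  # undersample: draw without replacement
            out.extend(rng.sample(members, quota))
        else:  # oversample: keep originals, add random duplicates (assumption)
            out.extend(members + rng.choices(members, k=quota - len(members)))
    return out

def balance(data, rng):
    """data maps class name -> list of vectors; target is the mean class size (assumption)."""
    target = round(sum(len(v) for v in data.values()) / len(data))
    return {name: resample_class(vectors, target, rng) for name, vectors in data.items()}

# Toy 2-D "document vectors": a large class with two blobs and a small class.
rng = random.Random(42)
big = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)] + \
      [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(40)]
small = [[rng.gauss(4, 0.5), rng.gauss(-4, 0.5)] for _ in range(10)]
balanced = balance({"big": big, "small": small}, rng)
print({name: len(vectors) for name, vectors in balanced.items()})  # {'big': 55, 'small': 55}
```

On the toy data both classes end up with 55 vectors, the mean of 100 and 10. The small class keeps all of its originals plus duplicates drawn cluster by cluster, while the large class is thinned within each cluster so that no cluster is dropped entirely by chance.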
Authors: WANG De-zhi (王德志); LIANG Jun-yan (梁俊艳) (School of Computer Engineering, North China Institute of Science and Technology, Langfang 065201, China; Library, North China Institute of Science and Technology, Langfang 065201, China)
Source: Computer Engineering and Design (《计算机工程与设计》), Peking University core journal, 2021, No. 9, pp. 2501-2508 (8 pages)
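Funding: National Key Research and Development Program of China (2018YFC0808306); Hebei Province Internet of Things Monitoring Engineering Technology Research Center fund project (3142018055).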
Keywords: unbalanced data set; emotion classification; text multi-classification; clustering; deep learning