
A Text Classification Method Based on Associative Semantics and Convolutional Neural Network (original title: 关联语义结合卷积神经网络的文本分类方法)

Cited by: 11
Abstract: Traditional text classification methods ignore the semantic information carried by words. To address this, the paper proposes a text classification method that combines associative semantics with a convolutional neural network (CNN). First, the text is preprocessed to extract word stems. Next, each word is combined with its associated context words to build word vectors that carry semantic information. The word-vector matrix of the text is then fed into the CNN, where a convolution layer and a max-pooling layer extract the most salient features and an output layer produces the class probabilities. Finally, the CNN is trained by minimizing a cost function, which yields the final text classifier. Experiments on two Chinese datasets show that the method classifies text accurately and is feasible and effective.
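The pipeline in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how a word is combined with its context words, which filter sizes are used, or which cost function is minimized. The sketch assumes window averaging for the context step, a ReLU activation, and cross-entropy as the cost. All function names, parameters, and the random toy data are hypothetical.

```python
import math
import random

def context_vectors(embeddings, window=1):
    """Assumed context step: average each word vector with its
    neighbours inside `window` (the paper may combine them differently)."""
    n = len(embeddings)
    dim = len(embeddings[0])
    out = []
    for i in range(n):
        span = embeddings[max(0, i - window):min(n, i + window + 1)]
        out.append([sum(v[d] for v in span) / len(span) for d in range(dim)])
    return out

def conv_max_pool(matrix, filt):
    """Slide one filter (h rows x dim columns) over the sentence matrix,
    apply ReLU, and keep the largest response (max pooling over positions).
    The sentence must contain at least h words."""
    h, dim = len(filt), len(filt[0])
    responses = []
    for i in range(len(matrix) - h + 1):
        s = sum(filt[r][d] * matrix[i + r][d] for r in range(h) for d in range(dim))
        responses.append(max(0.0, s))
    return max(responses)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(matrix, filters, weights, bias):
    """One pooled feature per filter, then a linear output layer and softmax."""
    features = [conv_max_pool(matrix, f) for f in filters]
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)

def cross_entropy(probs, label):
    """Assumed cost: negative log-probability of the true class."""
    return -math.log(probs[label])

# Toy run with random, untrained parameters (illustrative data only).
random.seed(0)
dim, n_words, n_classes = 4, 6, 2
embeddings = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_words)]
filters = [[[random.uniform(-1, 1) for _ in range(dim)] for _ in range(h)]
           for h in (2, 3)]
weights = [[random.uniform(-1, 1) for _ in filters] for _ in range(n_classes)]
bias = [0.0] * n_classes

probs = forward(context_vectors(embeddings), filters, weights, bias)
loss = cross_entropy(probs, label=1)
```

Training would repeat this forward pass over labeled documents and adjust the filters, weights, and bias by gradient descent to reduce the cost. That step is omitted here to keep the sketch short.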
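Author: Wei Yong (魏勇)
Source: Control Engineering of China (《控制工程》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 2, pp. 367-370 (4 pages)
Funding: Science and Technology Research Project of the Henan Provincial Department of Science and Technology (No. 162102310606); Henan Provincial Department of Education Funded Project (No. 16A520067)
Keywords: text classification; associative semantics; convolutional neural network; max pooling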
