Abstract
With the development of information technology, the volume of text data is growing explosively, and how to effectively extract useful information from large amounts of text is a problem worth studying. To address this task, this paper proposes a text classification model based on hierarchical feature extraction. The model considers both the sentence-level and the text-level semantic content of a text, and uses two neural network models in turn to model first the sentence-level and then the text-level semantic content. This yields a comprehensive feature of the text, on which the classification is then based. Experimental results show that the proposed method extracts text features more accurately and achieves higher classification accuracy.
Authors
宋岩
刘汉永
宁向南
孟宪哲
Song Yan; Liu Hanyong; Ning Xiangnan; Meng Xianzhe (Information Communication Company, State Grid Tianjin Electric Power Company, Tianjin 300000, China)
Source
《计算机应用与软件》
Peking University Core Journal
2020, Issue 2, pp. 68-72, 77 (6 pages)
Computer Applications and Software
Funding
Tianjin Science and Technology Plan Project (18ZXZNGX00310)
State Grid Tianjin Electric Power Company Science and Technology Project (kj18-1-17)
Keywords
Text
Hierarchical feature extraction
Vector representation of the text
Text classification