Abstract
Online book reviews are numerous and highly varied. The traditional bag-of-words model cannot represent the semantic information implicit in the text, nor can such text be classified with a single linear classifier, and manual monitoring and analysis lag well behind the data. This paper takes the book review portion of the online_shopping_10_cats dataset as its corpus. After text preprocessing, Word2vec is used to represent the text as vectors, yielding a semantic feature matrix. An SVM model is then trained and used for prediction, incremental training and GridSearchCV are applied to optimize the model, and a visual interface is built with Tkinter to perform sentiment recognition on text. Experiments show that the system achieves a precision of 0.94, a recall of 0.94, and an F1-score of 0.93, indicating good applicability.
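The core of the pipeline described in the abstract (word vectors, then an SVM tuned with GridSearchCV, then precision/recall/F1) can be illustrated with a minimal sketch. It is not the paper's implementation. The small `embeddings` table is a hypothetical stand-in for vectors that a trained Word2vec model would supply, the English toy reviews, the averaging helper `document_vector`, and the parameter grid are all assumptions made for illustration, and incremental training and the Tkinter interface are left out.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def document_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)


# Hypothetical stand-in for Word2vec output: random vectors clustered by polarity.
DIM = 8
rng = np.random.default_rng(0)
embeddings = {"book": rng.normal(0.0, 0.3, DIM)}
for word in ["good", "great", "love", "excellent"]:
    embeddings[word] = rng.normal(1.0, 0.3, DIM)
for word in ["bad", "poor", "hate", "terrible"]:
    embeddings[word] = rng.normal(-1.0, 0.3, DIM)

# Toy, already tokenized reviews (assumed data, not from the paper's corpus).
positive = [["great", "book"], ["love", "book"], ["excellent", "book"],
            ["good", "book"], ["good", "great"], ["love", "excellent", "book"]]
negative = [["bad", "book"], ["poor", "book"], ["hate", "book"],
            ["terrible", "book"], ["bad", "poor"], ["hate", "terrible", "book"]]

# Feature matrix: one averaged vector per review; labels: 1 positive, 0 negative.
X = np.vstack([document_vector(r, embeddings, DIM) for r in positive + negative])
y = np.array([1] * len(positive) + [0] * len(negative))

# Cross-validated grid search over an assumed, deliberately small SVM grid.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=3, scoring="f1")
search.fit(X, y)

# Metrics on the training data only; a real evaluation needs a held-out test set.
predictions = search.predict(X)
precision, recall, f1, _ = precision_recall_fscore_support(
    y, predictions, average="binary"
)
print(search.best_params_, precision, recall, f1)

# Classify an unseen toy review with the refitted best estimator.
new_vector = document_vector(["great", "love"], embeddings, DIM).reshape(1, -1)
new_label = int(search.predict(new_vector)[0])
```

Averaging word vectors is only one simple way to turn a review into a fixed-length feature vector, and the abstract does not say which aggregation the paper uses.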
Author
CHAI Yuan (柴源), Library of Xi’an Aeronautical University, Xi’an 710077, China
Source
Electronic Design Engineering (《电子设计工程》)
2022, No. 6, pp. 179-183 (5 pages)
Funding
2020 Scientific Research Program of the Shaanxi Provincial Department of Education (20JK0199).