Abstract
To address the sparse content and wide-spanning contextual information of Chinese short texts, and to classify their sentiment effectively, this work draws on the characteristics of review-style short texts: pre-trained character vectors are used, and text is fed to the models character by character, to improve generalization over the dataset. Several classic deep learning classification models are used to verify how character-based short texts are classified on takeaway (food delivery) review data. The experimental results show that every model can accurately judge the sentiment polarity of the short texts, which confirms the feasibility of character vectors and the effectiveness of the models for sentiment analysis. The generalization of each model on character-based review short texts also provides a useful reference for future transfer learning and further research.
Authors
贾钰峰
李容
章蓬伟
邵小青
JIA Yufeng; LI Rong; ZHANG Pengwei; SHAO Xiaoqing (School of Information Science and Engineering, Xinjiang University of Science and Technology, Korla 841300, China; School of Marxism, Xinjiang University of Science and Technology, Korla 841300, China)
Source
《微处理机》
2023, Issue 6, pp. 40-45 (6 pages)
Microprocessors
Funding
2021 University-Level General Research Project (2021-KYPT11).
Keywords
Character vector
Short text
Sentiment classification