Deep Learning Algorithm Applied in Tibetan Sentiment Analysis
Abstract: Earlier approaches to Tibetan sentiment analysis ignore important information such as sentence structure and word order, which leads to low accuracy. To extract semantic and sentiment information at a deeper level, this paper introduces the recursive autoencoder algorithm from deep learning into Tibetan sentiment analysis. First, after Tibetan word segmentation, each word is represented by a word vector, so a Tibetan sentence becomes a matrix composed of its word vectors. Second, an unsupervised recursive autoencoder turns this matrix into a single vector; the resulting sentence encoding incorporates important information such as semantics and word order. Finally, the output-layer classifier is trained in a supervised manner on the sentence vectors and their sentiment labels to predict the sentiment polarity of Tibetan sentences. The experimental section examines how the vector dimension, the reconstruction error coefficient, and the corpus size affect accuracy, and analyzes the relationship between corpus size and training time, noting that the number of sentences in the dataset can be reduced moderately when the model must be trained quickly. The experiments show that, with the best parameter combination, the proposed algorithm is about 8.6% more accurate than the semantic space model, one of the better-performing traditional machine learning algorithms.
Authors: PU Ciren; HOU Jialin; LIU Yue; ZHAI Donghai (Tibetan Information Technology Research Center, Tibet University, Lhasa 850000, China; School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China)
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2017, No. 7, pp. 1122-1130 (9 pages). Indexed in CSCD and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61540060); National Soft Science Research Program (2013GXS4D150); Scientific Research Project of the Science and Technology Department of Tibet Autonomous Region.
Keywords: deep learning; sentiment analysis; recursive auto encoder; recursive neural networks
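
The pipeline summarized in the abstract (word vectors, then a recursive autoencoder, then an output-layer classifier) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the greedy pairing rule, the tanh activation, the length normalization, the squared reconstruction error, the three-class softmax, and the names `encode_sentence`, `softmax`, `W_enc`, `W_dec`, and `W_cls`. The weights are random and untrained, so the predicted distribution carries no meaning; the sketch only shows how a sentence matrix might be folded into one vector.

```python
import numpy as np


def encode_sentence(words, W_enc, b_enc, W_dec, b_dec):
    """Greedily merge adjacent vectors until one sentence vector remains.

    words: array of shape (n_words, d), one word vector per row.
    Returns (sentence_vector, total_reconstruction_error).
    """
    nodes = [w for w in words]
    total_error = 0.0
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes) - 1):
            pair = np.concatenate([nodes[i], nodes[i + 1]])    # shape (2d,)
            parent = np.tanh(W_enc @ pair + b_enc)             # shape (d,)
            parent = parent / (np.linalg.norm(parent) + 1e-12)  # unit length
            recon = W_dec @ parent + b_dec                     # shape (2d,)
            err = float(np.sum((recon - pair) ** 2))
            if best is None or err < best[0]:
                best = (err, i, parent)
        err, i, parent = best
        nodes[i:i + 2] = [parent]  # replace the merged pair with its parent
        total_error += err
    return nodes[0], total_error


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


# Hypothetical sizes: 8-dimensional vectors, 5 words, 3 sentiment classes.
rng = np.random.default_rng(0)
d, n_words, n_classes = 8, 5, 3
W_enc = rng.normal(scale=0.1, size=(d, 2 * d))
b_enc = np.zeros(d)
W_dec = rng.normal(scale=0.1, size=(2 * d, d))
b_dec = np.zeros(2 * d)
W_cls = rng.normal(scale=0.1, size=(n_classes, d))
b_cls = np.zeros(n_classes)

sentence = rng.normal(size=(n_words, d))  # stand-in for segmented Tibetan words
vec, rec_err = encode_sentence(sentence, W_enc, b_enc, W_dec, b_dec)
probs = softmax(W_cls @ vec + b_cls)      # untrained sentiment distribution
print(vec.shape, probs.shape)
```

In a trained system the encoder and decoder weights would be fitted to minimize the reconstruction error over the corpus, and the classifier weights would be fitted on labeled sentences; the vector dimension `d` corresponds to one of the parameters the abstract says was varied in the experiments.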