
Comparison of Tourist Thematic Sentiment Analysis Methods Based on Weibo Data (cited by: 12)
Abstract: This study compares machine learning methods for thematic sentiment analysis of tourist Weibo posts across six tourism themes: diet, entertainment, shopping, scenery, transportation, and accommodation. The dataset consists of 53,140 manually labeled Weibo posts published by Chinese tourists visiting Japan. Two machine learning models, a Maximum Entropy model and a Support Vector Machine, are trained on this dataset, and the effect of different features on model performance is analyzed. Both models perform well and are suitable for thematic sentiment analysis of tourist Weibo posts, with the Maximum Entropy model slightly outperforming the Support Vector Machine. The experiments also show that extending word features with emoticons and thematic words improves model performance.
Authors: LIU Siye (刘思叶); TIAN Yuan (田原); FENG Yuning (冯雨宁); ZHUANG Yulong (庄育龙) (Institute of Remote Sensing and Geographical Information System, Peking University, Beijing 100871)
Source: Acta Scientiarum Naturalium Universitatis Pekinensis (《北京大学学报(自然科学版)》), 2018, No. 4, pp. 687-692 (6 pages). Indexed in: EI, CAS, CSCD, PKU Core.
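Funding: Supported by the National Key Research and Development Program of China (2018YFB0505500, 2018YFB0505504) and the Open Research Fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing ((16)重02).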
Keywords: thematic sentiment analysis; Weibo of tourists; Maximum Entropy model; Support Vector Machine (SVM)
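
The feature expansion described in the abstract (word features extended with emoticons and thematic words) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the theme lexicon, the bracketed emoticon format, the simple regex tokenizer, the binary positive/negative labels, and the toy training posts are all assumptions made for the example. A two-class Maximum Entropy model is equivalent to logistic regression, so that is what the sketch trains. The Support Vector Machine side of the comparison is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical theme lexicon; the paper's actual thematic word lists are not given here.
THEME_WORDS = {
    "diet": {"sushi", "ramen"},
    "accommodation": {"hotel", "hostel"},
}
# Assumed emoticon format: Weibo-style bracketed names such as [happy].
EMOTICON = re.compile(r"\[([^\[\]]+)\]")


def extract_features(text):
    """Return sparse counts of word, emoticon, and theme-word features."""
    feats = Counter()
    for emo in EMOTICON.findall(text):
        feats["emo=" + emo] += 1
    # Assumed tokenizer: lowercase ASCII words. Real Weibo text would need
    # Chinese word segmentation instead.
    for word in re.findall(r"[a-z]+", EMOTICON.sub(" ", text).lower()):
        feats["w=" + word] += 1
        for theme, lexicon in THEME_WORDS.items():
            if word in lexicon:
                feats["theme=" + theme] += 1
    return dict(feats)


def score(weights, feats):
    """Probability of the positive class under the logistic model."""
    z = sum(weights[f] * v for f, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train_maxent(samples, epochs=200, lr=0.5):
    """Binary maximum entropy (logistic regression) trained by plain SGD."""
    weights = Counter()
    for _ in range(epochs):
        for feats, label in samples:
            p = score(weights, feats)
            for f, v in feats.items():
                weights[f] += lr * (label - p) * v
    return weights


def predict(weights, text):
    return score(weights, extract_features(text))


# Toy posts invented for illustration; 1 = positive, 0 = negative.
train = [
    ("the ramen was great [happy]", 1),
    ("hotel room was dirty [angry]", 0),
    ("lovely sushi dinner [happy]", 1),
    ("the hostel was noisy [angry]", 0),
]
model = train_maxent([(extract_features(t), y) for t, y in train])
```

Calling `predict(model, "sushi again [happy]")` returns a probability above 0.5 on this toy data, because the word, emoticon, and theme features all carry positive weight. Dropping the `emo=` or `theme=` features from `extract_features` and retraining is one way to reproduce, in miniature, the kind of feature comparison the abstract describes.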
