
Sentiment Distribution Analysis of Spatial-Textual Data Based on Multi-dimensional Wavelet Clustering (cited by: 1)

Analyzing Sentiment Distribution with Spatial-textual Data of Multi-dimensional Clustering
Abstract: [Objective] This paper builds a sentiment analysis model for spatial-textual data based on multi-dimensional wavelet clustering (WaveCluster), so that text sentiment and spatial position can be analyzed together. [Methods] We integrated datasets from Yelp into a spatial-textual database and used lexicon-based sentiment analysis to construct feature vectors. We then proposed and analyzed two models that use multi-dimensional wavelet clustering: a Hybrid model and a Textual-Spatial model. [Results] Experiments verified that multi-dimensional wavelet clustering with the db2 and bior2.2 wavelet basis functions identifies more accurate cluster sets than DBSCAN and K-means in spatial-textual data mining, and that it achieves the best speed when clustering data at the 100-thousand to 10-million scale. [Limitations] The sentiment analysis uses a unigram language model and therefore lacks analysis of sentence-level meaning. [Conclusions] The proposed Textual-Spatial model effectively mines the distribution of sentiment tendency in multi-dimensional spatial-textual data. The Hybrid model gives spatial-textual recommender systems an effective way to compute spatial proximity and sentiment similarity simultaneously.
Authors: Li Ke (李柯); Sasaki Yuya (佐々木勇和) (School of Information Management, Nanjing University, Nanjing 210046, China; Graduate School of Information Science and Technology, Osaka University, Osaka 565-0871, Japan)
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》; indexed in CSSCI, CSCD, and the Peking University Core Journals list), 2019, Issue 7, pp. 14-22 (9 pages)
Keywords: Spatial-Textual Data; Sentiment Distribution Analysis; Wavelet Transform; Clustering
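
The pipeline summarized in the abstract (lexicon-based unigram scoring, a spatial-textual feature vector, then grid-based wavelet clustering) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy lexicon, the function names, the grid size, and the density threshold are all assumptions made for illustration, and a one-level Haar approximation stands in for the db2 and bior2.2 wavelets named in the abstract.

```python
from collections import defaultdict, deque

# Hypothetical toy lexicon; the lexicon used in the paper is not given here.
LEXICON = {"great": 1.0, "good": 0.5, "bad": -0.5, "awful": -1.0}

def unigram_sentiment(text):
    """Average lexicon score over the words found in the lexicon (0.0 if none)."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def quantize(points, bounds, bins):
    """Count points per grid cell; bounds holds one (low, high) pair per dimension."""
    grid = defaultdict(float)
    for p in points:
        cell = tuple(
            min(int((v - lo) / (hi - lo) * bins), bins - 1)
            for v, (lo, hi) in zip(p, bounds)
        )
        grid[cell] += 1.0
    return grid

def haar_approximation(grid, dims):
    """One-level Haar low-pass: merge each 2^dims block of cells into one coarse cell."""
    coarse = defaultdict(float)
    scale = 2 ** (dims / 2)  # product of the 1/sqrt(2) factors over all dimensions
    for cell, count in grid.items():
        coarse[tuple(i // 2 for i in cell)] += count / scale
    return coarse

def connected_clusters(coarse, threshold):
    """Label axis-adjacent coarse cells whose value reaches the threshold."""
    dense = {c for c, v in coarse.items() if v >= threshold}
    labels, next_label = {}, 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for axis in range(len(cell)):
                for step in (-1, 1):
                    nb = cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]
                    if nb in dense and nb not in labels:
                        labels[nb] = next_label
                        queue.append(nb)
        next_label += 1
    return labels

# Made-up reviews: (x, y, text), with coordinates already scaled to [0, 1].
reviews = [
    (0.10, 0.10, "great food good service"),
    (0.12, 0.11, "great place"),
    (0.11, 0.13, "good coffee great staff"),
    (0.90, 0.90, "awful food bad service"),
    (0.88, 0.91, "bad awful"),
    (0.91, 0.89, "awful wait"),
    (0.50, 0.50, "good"),  # isolated point, falls below the density threshold
]
# Feature vector per review: (x, y, sentiment score).
points = [(x, y, unigram_sentiment(t)) for x, y, t in reviews]
bounds = [(0.0, 1.0), (0.0, 1.0), (-1.0, 1.0)]
grid = quantize(points, bounds, bins=8)
coarse = haar_approximation(grid, dims=3)
labels = connected_clusters(coarse, threshold=1.0)
print(labels)
```

Treating the sentiment score as a third grid axis means that two reviews share a cluster only when they are close both in space and in sentiment. Replacing the Haar step with a db2 or bior2.2 filter bank would change the smoothing but not the overall quantize, transform, threshold, and label structure.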


