
An Automatic Analysis of User Reviews Method for APP Evolution and Maintenance

Cited by: 6
Abstract: Application distribution platforms such as Google Play Store and Apple App Store allow users to submit feedback on the applications they download, in the form of ratings or reviews. This feedback directly or indirectly reflects user intentions, and mining those intentions promptly and accurately can greatly help mobile developers (or app providers) continuously maintain and improve their apps, for example by fixing existing bugs and adding or refining features, so as to better meet user expectations. App reviews thus provide an opportunity to proactively collect user complaints and promptly improve the user experience. For many popular apps, however, the large volume of review data, its unstructured nature, and its inconsistent quality make identifying the valuable information a challenging task. Automatically analyzing user reviews to reduce the manual analysis workload has therefore become a new direction for app review mining. In this paper, we propose ARICA (Automatic Review Intention Classification Analysis), a method that automatically analyzes crowd user reviews to efficiently provide developers with software maintenance and evolution suggestions. First, ARICA automatically classifies reviews by intent according to the users' feedback, and then uses the LDA topic model to partition the reviews under each intent category into topics, which gives a preliminary screening of the reviews in each category. Second, under each topic, ARICA clusters reviews that express similar semantics, which filters redundant information and makes the users' original feedback easier to understand and their true intentions easier to capture. Next, ARICA uses the sentiment analysis tool SentiStrength to obtain user sentiment, and analyzes the sentiment distribution of the reviews to identify the users' important intentions. Finally, it combines multi-dimensional information such as user intent and sentiment preference to compute a score for each review, ranks reviews by priority on that basis, and recommends review opinions to developers. We use real app review data from Google Play to evaluate the performance of ARICA's intent classification and sentence clustering. The experimental results show that ARICA reaches an accuracy of 80% in review intent classification and improves the F-Measure by 19.1% over TextCNN, an existing method based on convolutional neural networks; review sentence clustering reaches an accuracy of 86%. In addition, to validate the effectiveness of the recommended reviews, we use official app update logs to empirically analyze whether the suggestions recommended by ARICA are actually adopted by developers. The results show that ARICA can efficiently recommend reviews containing valuable information, which is of significant value for developers' subsequent app maintenance and evolution tasks.
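Authors: XIAO Jian-Mao (肖建茂); CHEN Shi-Zhan (陈世展); FENG Zhi-Yong (冯志勇); LIU Peng-Li (刘朋立); XUE Xiao (薛霄) (Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350; College of Intelligence and Computing, Tianjin University, Tianjin 300350)
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list (北大核心), 2020, Issue 11, pp. 2184-2202 (19 pages)
Funding: Supported by the Key Program of the National Natural Science Foundation of China (61832014), the National Natural Science Foundation of China (61572350), and the National Key Research and Development Program of China (2017YFB1401201).
Keywords: user reviews; intent classification; sentiment analysis; maintenance and evolution; opinion recommendations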
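
The abstract's later stages (clustering similar reviews, scoring sentiment, ranking by priority) can be illustrated with a minimal Python sketch. This is not ARICA's implementation. The word lists stand in for SentiStrength, Jaccard token overlap stands in for semantic clustering, and the scoring rule in `priority` is an assumption made only for illustration. Intent classification and LDA topic modeling are left out, and all function names are hypothetical.

```python
# Toy sentiment lexicon: a stand-in for SentiStrength, NOT its real word list.
NEGATIVE = {"crash", "crashes", "bug", "slow", "broken", "freezes"}
POSITIVE = {"love", "great", "nice", "good"}


def tokens(text: str) -> set:
    """Lowercase a review and split it into a set of alphabetic words."""
    cleaned = "".join(ch if ch.isalpha() else " " for ch in text.lower())
    return set(cleaned.split())


def sentiment(text: str) -> int:
    """Positive-minus-negative word count (hypothetical scoring rule)."""
    words = tokens(text)
    return len(words & POSITIVE) - len(words & NEGATIVE)


def jaccard(a: set, b: set) -> float:
    """Token-set overlap in [0, 1]; a crude proxy for semantic similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(reviews: list, threshold: float = 0.3) -> list:
    """Greedy clustering: a review joins the first cluster whose seed
    (first member) is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for review in reviews:
        for group in clusters:
            if jaccard(tokens(review), tokens(group[0])) >= threshold:
                group.append(review)
                break
        else:
            clusters.append([review])
    return clusters


def priority(group: list) -> float:
    """Assumed score: cluster size weighted by the share of negative reviews."""
    negative_share = sum(sentiment(r) < 0 for r in group) / len(group)
    return len(group) * (1.0 + negative_share)


def rank(reviews: list) -> list:
    """Return clusters ordered from highest to lowest priority."""
    return sorted(cluster(reviews), key=priority, reverse=True)


if __name__ == "__main__":
    sample = [
        "App crashes when I open the camera",
        "The app crashes when I open camera",
        "Love the new dark theme",
    ]
    for group in rank(sample):
        print(priority(group), group)
```

Weighting cluster size by the negative share is one simple way to surface complaints that many users repeat. The paper combines intent and sentiment in its own scoring, which the abstract does not spell out.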