Abstract
Application distribution platforms such as the Google Play Store and the Apple App Store allow users to submit feedback on the applications they download, in the form of ratings or reviews. This feedback directly or indirectly reflects user intent, and mining that intent promptly and accurately can greatly help mobile developers (or app providers) maintain and improve their apps continuously, for example by fixing existing bugs and adding or refining features, and so better meet user expectations. App reviews thus provide an opportunity to collect user complaints proactively and to improve the user experience promptly. However, for many popular applications, the large volume of review data, its unstructured form, and its inconsistent quality make identifying valuable information a challenging task. Automating review analysis to reduce the manual workload has therefore become a new direction in app review mining. In this paper, we propose ARICA (Automatic Review Intention Classification Analysis), a method that automatically analyzes crowd user reviews in order to provide developers with software maintenance and evolution suggestions efficiently. First, ARICA classifies reviews into intent categories based on the user feedback and then uses the LDA topic model to divide the reviews under each intent category into topics, which gives a preliminary screening of the review information under each category. Second, ARICA clusters reviews with similar semantics under each topic to filter out redundant information, making the original feedback easier and more intuitive to understand and the user's true intention easier to capture accurately. ARICA then uses the sentiment analysis tool SentiStrength to obtain user sentiment and analyzes the sentiment distribution of the reviews to identify the users' significant intentions. Finally, it combines multidimensional information such as user intent and user sentiment preference to compute a score for each review, prioritizes reviews by that score, and recommends review opinions to developers accordingly. We use real app review data from Google Play to evaluate ARICA's intent classification and sentence clustering. The experimental results show that ARICA achieves 80% accuracy in review intent classification and improves the F-Measure by 19.1% over TextCNN, an existing method based on convolutional neural networks; review sentence clustering achieves 86% accuracy. In addition, to verify the effectiveness of the recommended reviews, we use official app update logs to analyze empirically whether the suggestions recommended by ARICA are actually adopted by developers. The results show that ARICA can efficiently recommend reviews containing valuable information, which is of significant value for developers' subsequent app maintenance and evolution tasks.
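The final step described in the abstract, scoring reviews from several signals and ranking them, can be illustrated with a minimal sketch. The abstract does not give ARICA's scoring formula, so everything below is an assumption made for illustration: the intent weights, the `ReviewCluster` fields, and the way the three signals are multiplied are hypothetical. Only the negative sentiment range of -1 to -5 follows SentiStrength's reporting scale.

```python
import math
from dataclasses import dataclass

# Hypothetical intent weights; the abstract does not state ARICA's actual values.
INTENT_WEIGHTS = {"bug report": 1.0, "feature request": 0.8, "other": 0.3}


@dataclass
class ReviewCluster:
    summary: str          # representative sentence for the cluster
    intent: str           # intent category assigned by some classifier
    size: int             # number of semantically similar reviews (>= 1)
    mean_negative: float  # mean negative sentiment on a -1..-5 scale


def priority_score(cluster: ReviewCluster) -> float:
    """Combine intent, volume, and negative sentiment into one score.

    This is an illustrative formula, not the one used in the paper.
    """
    intent_weight = INTENT_WEIGHTS.get(cluster.intent, INTENT_WEIGHTS["other"])
    volume = 1.0 + math.log(cluster.size)         # dampen very large clusters
    negativity = abs(cluster.mean_negative) / 5.0  # normalize to 0.2..1.0
    return intent_weight * volume * negativity


def rank_clusters(clusters: list[ReviewCluster]) -> list[ReviewCluster]:
    """Return clusters ordered from highest to lowest priority."""
    return sorted(clusters, key=priority_score, reverse=True)


# Made-up example data, for illustration only.
clusters = [
    ReviewCluster("Please add a dark mode", "feature request", 10, -2.0),
    ReviewCluster("Nice app overall", "other", 100, -1.5),
    ReviewCluster("App crashes on login", "bug report", 40, -4.0),
]

for cluster in rank_clusters(clusters):
    print(f"{priority_score(cluster):.2f}  {cluster.summary}")
```

With this made-up data the crash report ranks first, because it combines the highest intent weight with strongly negative sentiment, while the large but mild "other" cluster ranks last. The logarithm is one possible way to keep cluster size from dominating the score.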
Authors
XIAO Jian-Mao (肖建茂)
CHEN Shi-Zhan (陈世展)
FENG Zhi-Yong (冯志勇)
LIU Peng-Li (刘朋立)
XUE Xiao (薛霄)
Affiliations: Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350; College of Intelligence and Computing, Tianjin University, Tianjin 300350
Source
Chinese Journal of Computers (《计算机学报》)
Indexed in: EI; CSCD; Peking University Core Journals
2020, No. 11, pp. 2184-2202 (19 pages)
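Funding
Key Program of the National Natural Science Foundation of China (61832014)
National Natural Science Foundation of China (61572350)
National Key Research and Development Program of China (2017YFB1401201)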
Keywords
user reviews
intent classification
sentiment analysis
maintenance and evolution
opinion recommendations