Abstract
This paper takes Douban movie reviews as its research object and uses text mining methods to explore the factors that make up movie consumer satisfaction. First, 20,000 Douban movie reviews are crawled and preprocessed; word frequency statistics and word cloud analysis are then applied to the preprocessed reviews to summarize the main factors that affect audience satisfaction with a movie. Next, the text is vectorized and clustered with the fit function of the k-means algorithm to identify the main feature factors that movie consumers care about, and a consumer satisfaction index system is established from them. A sentiment dictionary is then constructed to calculate a sentiment score for each feature factor, a satisfaction score table is produced, and the overall satisfaction of movie consumers is analyzed. Finally, a Bayesian network model is built to determine the relationships among the satisfaction factors and their degrees of influence. The results show that the factors influencing movie consumer satisfaction, ranked by influence coefficient, are: characters (0.1945), director (0.1693), plot (0.1618), actors (0.15), subject matter (0.1407), performance (0.1254), and audiovisual effects (0.0585). Optimization suggestions are proposed from three aspects: film configuration, film design, and film presentation.
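The sentiment-scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon, the factor keywords, the negation rule, and the function names `score_review` and `satisfaction_table` are all hypothetical, and the reviews are assumed to be tokenized already (for Chinese text, by a word segmenter run upstream). English toy tokens are used only for readability.

```python
from collections import defaultdict

# Hypothetical toy sentiment lexicon: word -> polarity score.
SENTIMENT_LEXICON = {
    "great": 1.0, "moving": 1.0, "brilliant": 1.0,
    "boring": -1.0, "weak": -1.0, "messy": -1.0,
}

# Hypothetical keywords that signal each feature factor.
FACTOR_KEYWORDS = {
    "plot": {"plot", "story"},
    "actors": {"actor", "cast"},
    "director": {"director"},
}

# Assumption: a sentiment word directly preceded by one of these is flipped.
NEGATIONS = {"not", "never"}


def score_review(tokens):
    """Return {factor: polarity} for one tokenized review.

    The review's polarity is the sum of lexicon scores, with a simple
    one-token negation flip; it is credited to every factor the review mentions.
    """
    polarity = 0.0
    for i, tok in enumerate(tokens):
        if tok in SENTIMENT_LEXICON:
            score = SENTIMENT_LEXICON[tok]
            if i > 0 and tokens[i - 1] in NEGATIONS:
                score = -score
            polarity += score
    token_set = set(tokens)
    return {
        factor: polarity
        for factor, keywords in FACTOR_KEYWORDS.items()
        if keywords & token_set
    }


def satisfaction_table(reviews):
    """Average the per-review scores of each factor over all reviews."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens in reviews:
        for factor, score in score_review(tokens).items():
            totals[factor] += score
            counts[factor] += 1
    return {factor: totals[factor] / counts[factor] for factor in totals}
```

Calling `satisfaction_table` on a list of tokenized reviews returns one average score per factor that appears in the data. Factors that no review mentions are left out of the table rather than given a score of zero.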
Source
Statistics and Application (《统计学与应用》)
2024, No. 4, pp. 1128-1139 (12 pages)