Abstract
Douban is a website that grew by providing content related to books, film, and television, and it offers a wide range of information about films. Comments from Douban users can sometimes set new trends. This paper uses Python and web-crawling techniques to design a system for crawling Douban short reviews of films and television programs. The system comprises modules for URL management, web page structure analysis, data collection, data cleaning, data analysis, and data visualization. It saves the review content of specified films so as to accurately gauge how well different films are liked and the response they generate after release.
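The abstract names the modules but not their implementation. As a rough illustration only, the URL manager, data cleaning, and data analysis stages might be sketched as follows. Every name here (`UrlManager`, `clean_comment`, `rating_distribution`, the `stars` and `text` fields, and the sample records) is a hypothetical choice made for this sketch and is not taken from the paper. The sketch runs offline on made-up data and does not contact Douban.

```python
import re
from collections import Counter, deque


class UrlManager:
    """Hypothetical URL manager: tracks URLs still to crawl and URLs already seen."""

    def __init__(self):
        self._pending = deque()
        self._seen = set()

    def add(self, url):
        # Queue a URL only the first time it is seen, so no page is crawled twice.
        if url and url not in self._seen:
            self._seen.add(url)
            self._pending.append(url)

    def has_next(self):
        return bool(self._pending)

    def next(self):
        # Hand out URLs in the order they were added.
        return self._pending.popleft()


def clean_comment(text):
    """Strip HTML tags and collapse runs of whitespace in one raw comment."""
    text = re.sub(r"<[^>]+>", "", text)
    return re.sub(r"\s+", " ", text).strip()


def rating_distribution(records):
    """Count comments per star rating, skipping comments that are empty after cleaning."""
    return Counter(r["stars"] for r in records if clean_comment(r["text"]))


# Made-up records standing in for collected short reviews (assumed field names).
sample = [
    {"stars": 5, "text": "<span>Great   film</span>"},
    {"stars": 5, "text": " Loved it\n"},
    {"stars": 2, "text": "<br>"},
    {"stars": 3, "text": "Average"},
]

if __name__ == "__main__":
    print(rating_distribution(sample))
```

A distribution like this could then feed the visualization stage, for example as a bar chart of review counts per star rating.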
Authors
高雨菲
毛红霞
GAO Yufei; MAO Hongxia (School of Computer and Software, Jincheng College of Sichuan University, Chengdu 611731, China)
Source
《现代信息科技》
2020, No. 24, pp. 10-12, 16 (4 pages in total)
Modern Information Technology