Abstract
A domain-specific Web information integration system was built using a modular design. A system for collecting and processing news and microblog data based on domain keywords was designed and implemented: starting from keywords supplied by the user, the keyword set is expanded with manual screening, and related news and microblog data are then collected and extracted from across the Web. A news ranking method based on keywords and repost counts was also designed and implemented; it ranks the news collected for a given domain and selects the important items for targeted push. The Web information integration system was designed using the field of climate change as an example.
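The abstract does not give the ranking formula, so the following is only a minimal sketch of how a ranking based on keywords and repost counts could look. The names `NewsItem`, `score`, and `rank`, the weights, and the log-damped repost term are all assumptions made for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class NewsItem:
    title: str
    body: str
    reposts: int  # forwarding (repost) count; hypothetical field name


def score(item, keywords, title_weight=2.0, repost_weight=1.0):
    """Combine keyword hits with a damped repost count (assumed formula)."""
    title = item.title.lower()
    body = item.body.lower()
    keyword_score = 0.0
    for kw in keywords:
        k = kw.lower()
        # Assumption: a hit in the title counts more than a hit in the body.
        keyword_score += title_weight * title.count(k) + body.count(k)
    # log1p damps very large repost counts so they do not swamp keyword hits.
    return keyword_score + repost_weight * math.log1p(item.reposts)


def rank(items, keywords, top_n=10):
    """Return the top_n items by descending score."""
    ordered = sorted(items, key=lambda it: score(it, keywords), reverse=True)
    return ordered[:top_n]
```

Under these assumptions, the items returned by `rank` would be the candidates for targeted push; the weights would need tuning against whatever the actual system uses.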
Source
Modern Electronics Technique (《现代电子技术》)
Peking University Core Journal (北大核心)
2016, No. 11, pp. 125-128 (4 pages)
Keywords
Web information integration
microblog data acquisition
climate change
information push