Abstract
This paper analyzes web crawler technology and ETL (Extraction, Transformation, Loading) technology, discusses an algorithm that fuses the two, and applies that algorithm to information collection during aircraft development. The experimental results show that the algorithm fully meets the requirements of unstructured data collection.
Source
Aeronautical Science & Technology (《航空科学技术》)
2014, Issue 6, pp. 43-46 (4 pages)
Keywords
information collection
web crawler
ETL