Abstract
Because traditional FTP search engines usually rely on centralized spiders to collect data, their major weakness is the poor temporal effectiveness of retrieval results. To address this problem, a targeted data acquisition and update model is presented. The model rests on two key techniques: data update frequency and queue order. The data update frequency is designed to balance a high average ratio of valid FTP file download links against the server load that frequent data acquisition imposes, keeping that load as low as possible. The queue order optimizes the strategy for deciding the sequence in which the FTP sites in the queue are visited during a single data acquisition and update task.
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
PKU Core Journal (北大核心)
2009, No. 8, pp. 1853-1854, 1885 (3 pages)
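Funding
National 863 High-Tech Research and Development Program of China (2006AA10Z239)
EU Asia Information Technology and Communications fund project (Europe Aid/117839/C/G-41-15)
Open Fund of the Jiangsu Provincial Key Laboratories of Higher Education Institutions (2006)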
Keywords
FTP
search engine
temporal effectiveness
data update frequency
queue order