Abstract
The Web is a carrier of massive, dynamic information, and the main goal of this research is a high-performance, highly reliable system architecture that supports gathering, analyzing, and processing vast numbers of Web pages. This paper addresses the problem that nodes in a parallel Web crawling system may fail temporarily, and proposes a scheme for dynamic system reconfiguration. The scheme is based on a two-phase mapping from page URLs to crawling nodes, which ensures that when the configuration (the number of nodes) changes, the system reaches a new steady state after a short, safe transition period, thereby making the system dynamically reconfigurable.
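The abstract does not spell out the two phases, so the following is only a minimal sketch of one common way such a mapping can be built. It assumes that phase one hashes a URL's host to one of a fixed number of logical buckets, and that phase two looks the bucket up in a table of crawling nodes. The names `NUM_BUCKETS`, `url_to_bucket`, `build_table`, `remove_node`, and `node_for` are hypothetical and are not taken from the paper.

```python
import hashlib
from urllib.parse import urlsplit

NUM_BUCKETS = 64  # hypothetical: fixed number of logical buckets, independent of node count


def url_to_bucket(url, num_buckets=NUM_BUCKETS):
    """Phase 1 (assumed): hash the URL's host to a logical bucket."""
    host = urlsplit(url).hostname or url  # fall back to the raw string if no host is parsed
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def build_table(nodes, num_buckets=NUM_BUCKETS):
    """Phase 2 (assumed): assign each bucket to a node, round-robin."""
    if not nodes:
        raise ValueError("at least one node is required")
    return {b: nodes[b % len(nodes)] for b in range(num_buckets)}


def remove_node(table, failed, survivors):
    """Reassign only the failed node's buckets; all other buckets keep their node."""
    if not survivors:
        raise ValueError("at least one surviving node is required")
    new_table = dict(table)
    orphaned = sorted(b for b, n in table.items() if n == failed)
    for i, bucket in enumerate(orphaned):
        new_table[bucket] = survivors[i % len(survivors)]
    return new_table


def node_for(url, table):
    """Compose both phases: URL -> bucket -> node."""
    return table[url_to_bucket(url, len(table))]
```

Under these assumptions, a node failure changes only the second-phase table, so URLs handled by the surviving nodes stay where they were and only the failed node's share is redistributed. This is one way the short transition period described above could come about.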
Source
Information Technology (《信息技术》)
2004, No. 7, pp. 73-75 (3 pages)