Counting the cardinality of flows for massive high-speed traffic over sliding windows is still a challenging work under time and space constrains, but plays a key role in many network applications, such as traffic man...Counting the cardinality of flows for massive high-speed traffic over sliding windows is still a challenging work under time and space constrains, but plays a key role in many network applications, such as traffic management and routing optimization in software defined network. In this pa- per, we propose a novel data structure (called LRU-Sketch) to address the problem. The significant contributions are as follows. 1) The proposed data structure adapts a well-known probabilistic sketch to sliding window model; 2) By using the least-recently used (LRU) replacement policy, we design a highly time-efficient algorithm for timely forgetting stale information, which takes constant (O(1)) time per time slot; 3) Moreover, a further memory-reducing schema is given at a cost of very little loss of accuracy; 4) Comprehensive ex- periments, performed on two real IP trace files, confirm that the proposed schema attains high accuracy and high time efficiency.ferences including IEEE TPDS, ACM ToS, JCST, MIDDLEWARE, CLUSTER, NAS, etc. Currently, his research interests include big data management, cloud storage, and distributed file systems.展开更多
基金This work was supported by the National High Tech- nology Research and Development Program of China (2012AA01A510 and 2012AA01AS09), and partially supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61402518, 61403060), and the Jiangsu Province Science Foundation for Youths (BK20150722).
文摘Counting the cardinality of flows for massive high-speed traffic over sliding windows is still a challenging work under time and space constrains, but plays a key role in many network applications, such as traffic management and routing optimization in software defined network. In this pa- per, we propose a novel data structure (called LRU-Sketch) to address the problem. The significant contributions are as follows. 1) The proposed data structure adapts a well-known probabilistic sketch to sliding window model; 2) By using the least-recently used (LRU) replacement policy, we design a highly time-efficient algorithm for timely forgetting stale information, which takes constant (O(1)) time per time slot; 3) Moreover, a further memory-reducing schema is given at a cost of very little loss of accuracy; 4) Comprehensive ex- periments, performed on two real IP trace files, confirm that the proposed schema attains high accuracy and high time efficiency.ferences including IEEE TPDS, ACM ToS, JCST, MIDDLEWARE, CLUSTER, NAS, etc. Currently, his research interests include big data management, cloud storage, and distributed file systems.