期刊文献+

Efficient Feature Extraction Using Apache Spark for Network Behavior Anomaly Detection 被引量:2

Efficient Feature Extraction Using Apache Spark for Network Behavior Anomaly Detection
原文传递
导出
摘要 Extracting and analyzing network traffic feature is fundamental in the design and implementation of network behavior anomaly detection methods. The traditional network traffic feature method focuses on the statistical features of traffic volume. However, this approach is not sufficient to reflect the communication pattern features. A different approach is required to detect anomalous behaviors that do not exhibit traffic volume changes, such as low-intensity anomalous behaviors caused by Denial of Service/Distributed Denial of Service (DoS/DDoS) attacks, Internet worms and scanning, and BotNets. We propose an efficient traffic feature extraction architecture based on our proposed approach, which combines the benefit of traffic volume features and network communication pattern features. This method can detect low-intensity anomalous network behaviors and conventional traffic volume anomalies. We implemented our approach on Spark Streaming and validated our feature set using labelled real-world dataset collected from the Sichuan University campus network. Our results demonstrate that the traffic feature extraction approach is efficient in detecting both traffic variations and communication structure changes. Based on our evaluation of the MIT-DRAPA dataset, the same detection approach utilizes traffic volume features with detection precision of 82.3% and communication pattern features with detection precision of 89.9%. Our proposed feature set improves precision by 94%. Extracting and analyzing network traffic feature is fundamental in the design and implementation of network behavior anomaly detection methods. The traditional network traffic feature method focuses on the statistical features of traffic volume. However, this approach is not sufficient to reflect the communication pattern features. A different approach is required to detect anomalous behaviors that do not exhibit traffic volume changes, such as low-intensity anomalous behaviors caused by Denial of Service/Distributed Denial of Service (DoS/DDoS) attacks, Internet worms and scanning, and BotNets. We propose an efficient traffic feature extraction architecture based on our proposed approach, which combines the benefit of traffic volume features and network communication pattern features. This method can detect low-intensity anomalous network behaviors and conventional traffic volume anomalies. We implemented our approach on Spark Streaming and validated our feature set using labelled real-world dataset collected from the Sichuan University campus network. Our results demonstrate that the traffic feature extraction approach is efficient in detecting both traffic variations and communication structure changes. Based on our evaluation of the MIT-DRAPA dataset, the same detection approach utilizes traffic volume features with detection precision of 82.3% and communication pattern features with detection precision of 89.9%. Our proposed feature set improves precision by 94%.
出处 《Tsinghua Science and Technology》 SCIE EI CAS CSCD 2018年第5期561-573,共13页 清华大学学报(自然科学版(英文版)
基金 supported by the National Natural Science Foundation of China (No. 61272447) Sichuan Province Science and Technology Planning (Nos. 2016GZ0042, 16ZHSF0483, and 2017GZ0168) Key Research Project of Sichuan Provincial Department of Education (Nos. 17ZA0238 and 17ZA0200) Scientific Research Staring Foundation for Young Teachers of Sichuan University (No. 2015SCU11079)
关键词 feature extraction graph theory network behavior anomaly detection Apache Spark feature extraction graph theory network behavior anomaly detection Apache Spark
  • 相关文献

参考文献1

二级参考文献30

  • 1AZZOUNA N B, GUILLEMIN F. Impact of peer-to-peer applications on wide area network traffic [C]//Proc of IEEE Global Telecommunications Conference. Texas: IEEE, 2004: 1544-1548. 被引量:1
  • 2KIM J T, PARK H K, PAIK E H. Security issues in peer-to-peer systems [C]// The 7th International Conference on Advanced Communications Technology. Phoenix Park: IEEE, 2005: 1059- 1063. 被引量:1
  • 3SEN S, WANG J. Analyzing peer-to-peer traffic across large networks [J]. IEEE Trans on Networking, 2004, 2(2): 219-231. 被引量:1
  • 4SEN S, SPATSCHECK O, WANG Dong-mei. Accurate, scalable in-network identification of P2P traffic using application signatures [C]// Proceedings of ACM WWW'04. New York: ACM, 2004: 512-521. 被引量:1
  • 5ERMAN J, MAHANTI A, ARLITT M, COHEN I, WILLIAMSON C. Semi-supervised network traffic classification [C]// Proceedings of the ACM SIGMETRICS, New York: ACM, 2007: 369-370. 被引量:1
  • 6ZANDER S, NGUYEN T, ARMITAGE G. Self-learning IP traffic classification based on statistical flow characteristics [J]. Passive and Active Network Measurement, 2005, 3431: 325-328. 被引量:1
  • 7MOORE A W, ZUEV D. Intemet traffic classification using bayesian analysis techniques [C]// Proceedings of the ACM SIGMETRICS. 2005: 50-60. 被引量:1
  • 8MOORE A W, PAPAGIANNAKI K. Towards the accurate identification of network applications [C]// Passive and Active Measurements Workshop. Boston, MA, USA, Springer Press, 2005: 41-54. 被引量:1
  • 9HAFFNER P, SEN S, SPATSCHECK O, WANG Dong-mei. ACAS: automated construction of application signatures [C]// Proceedings of the 2005 ACM SIGCOMM Workshop on Mining Network Data. ACM, New York, 2005: 192-202. 被引量:1
  • 10KARAGIANNIS T, BROIDO A, FALOUTSOS M, CLAFFY K. Transport layer identification of P2P traffic [C]// Proceedings of the 4th ACM SIGCOMM conference on Internet Measurement. Sicily, USA: ACM, 2004: 121-134. 被引量:1

共引文献5

同被引文献9

引证文献2

二级引证文献30

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部