High-Dimensional Data Stream Outlier Detection Algorithm Based on Angle Distribution (cited by: 6)
Abstract: To detect outliers in high-dimensional data streams effectively, a high-dimensional data stream outlier detection (DSOD) algorithm based on angle distribution is proposed. An angle-distribution-based method is used to accurately identify normal points, border points, and outliers in high-dimensional data sets. A small-scale calculation set for the data stream, built from a normal set and a border set, reduces the algorithm's time and space costs. An update mechanism for the normal set and the border set addresses concept drift in large data streams. Experimental results on real data sets show that DSOD is more efficient than the Simple variance of angles (Simple VOA) algorithm and the angle-based outlier detection (ABOD) algorithm, and that it is suitable for outlier detection on large data streams.
Source: Journal of Shanghai Jiaotong University (indexed in EI, CAS, CSCD, Peking University Core), 2014, No. 5, pp. 647-652 (6 pages)
Funding: National Natural Science Foundation of China (11247325); Natural Science Foundation of the Chongqing Science and Technology Commission (CSTC2013yykfC60005, CSTC2011BB4145, CSTC2013jcsf-jcssX0022, CSTC2013jcyjjq60002)
Keywords: angle distribution; data stream; high-dimensional; outlier detection
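
The abstract names angle-based scoring (ABOD, Simple VOA) as the family of methods DSOD builds on. As a rough illustration of that general idea only, the sketch below computes a brute-force variance-of-angles score for each point in a static data set: a point far from the data sees all other points within a narrow cone, so the angles it spans vary little. This is not the paper's DSOD algorithm. It omits the normal set, border set, and update mechanism, and the function names `angle`, `variance_of_angles`, and `voa_scores` are hypothetical.

```python
import math
from itertools import combinations


def angle(u, v):
    """Angle in radians between vectors u and v (0.0 if either is a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 0.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def variance_of_angles(p, others):
    """Variance of the angles at p spanned by every pair of points in `others`."""
    if len(others) < 2:
        raise ValueError("need at least two other points to form an angle")
    # Difference vectors from p to every other point.
    diffs = [[x - y for x, y in zip(q, p)] for q in others]
    angles = [angle(u, v) for u, v in combinations(diffs, 2)]
    mean = sum(angles) / len(angles)
    return sum((t - mean) ** 2 for t in angles) / len(angles)


def voa_scores(points):
    """Score every point against all the others; a lower score is more outlier-like."""
    return [
        variance_of_angles(p, points[:i] + points[i + 1:])
        for i, p in enumerate(points)
    ]


if __name__ == "__main__":
    # Assumed toy data: a 3x3 grid of points plus one distant point.
    grid = [[float(x), float(y)] for x in range(3) for y in range(3)]
    data = grid + [[100.0, 100.0]]
    scores = voa_scores(data)
    print("most outlier-like index:", scores.index(min(scores)))
```

This brute-force version examines every pair of points for every query point, so its cost grows cubically with the number of points. That cost is the kind of overhead that a small calculation set, as described in the abstract, is meant to avoid.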
