Approach of Efficient Outlier Detection (一种高效异常检测方法)

Cited by: 7
Abstract: Drawing on the law of universal gravitation, this paper proposes a dissimilarity measure and a measure of how far a cluster departs from the dataset as a whole. Building on these two measures, it introduces a clustering-based outlier detection approach named EOD. The time complexity of the approach is nearly linear in both the size of the dataset and the number of attributes, so it scales well and suits large datasets. Theoretical analysis and experimental results on real datasets show that the approach is effective, robust, and practical.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2007, No. 7, pp. 166-168 (3 pages)
Funding: National Natural Science Foundation of China (60503048, 60673191); Guangdong University of Foreign Studies Key Project Fund (GW2005-1-012)
Keywords: Clustering; Outlier factor; Outlier detection
References (9)

  • 1. Yamanishi K, Takeuchi J. Discovering Outlier Filtering Rules from Unlabeled Data: Combining a Supervised Learner with an Unsupervised Learner [C] // Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2001-08. Cited by: 1
  • 2. Knorr E M. Outliers and Data Mining: Finding Exceptions in Data [D]. Canada: University of British Columbia, 2002. Cited by: 1
  • 3. Breunig M M, Kriegel H P, Ng R T, et al. LOF: Identifying Density-based Local Outliers [C] // Proceedings of SIGMOD'00, Dallas, Texas. 2000: 427-438. Cited by: 1
  • 4. Jiang Shengyi, Li Qinghua, Wang Hui, Meng Zhonglou. An Enhanced Local Outlier Mining Method [J]. Journal of Computer Research and Development (计算机研究与发展), 2005, 42(2): 210-216. Cited by: 8
  • 5. He Zengyou, Xu Xiaofei, Deng Shengchun. Discovering Cluster-based Local Outliers [J]. Pattern Recognition Letters, 2003, 24(9/10): 1651-1660. Cited by: 1
  • 6. Jiang Shengyi, Li Qinghua, Zhao Yanxi. A Two-Stage Outlier Detection Method [J]. Mini-Micro Systems (小型微型计算机系统), 2005, 26(7): 1237-1240. Cited by: 7
  • 7. Hawkins S, He H, Williams G J, et al. Outlier Detection Using Replicator Neural Networks [C] // Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery. 2002: 170-180. Cited by: 1
  • 8. Jiang Shengyi, Li Qinghua. A Gravity-Based Clustering Method [J]. Journal of Computer Applications (计算机应用), 2005, 25(2): 286-288. Cited by: 9
  • 9. Merz C J, Murphy P. UCI Repository of Machine Learning Databases [Z]. 1996. http://www.ics.uci.edu/~mlearn/MLRepository.html. Cited by: 1
