期刊文献+

基于局部密度的快速离群点检测算法 被引量:26

Fast outlier detection algorithm based on local density
下载PDF
导出
摘要 已有的密度离群点检测算法LOF不能适应数据分布异常情况离群点检测,INFLO算法虽引入反向k近邻点集有效地解决了数据分布异常情况的离群点检测问题,但存在需要对所有数据点不加区分地分析其k近邻和反向k近邻点集导致的效率降低问题。针对该问题,提出局部密度离群点检测算法——LDBO,引入强k近邻点和弱k近邻点概念,通过分析邻近数据点的离群相关性,对数据点区别对待;并提出数据点离群性预判断策略,尽可能避免不必要的反向k近邻分析,有效提高数据分布异常情况离群点检测算法的效率。理论分析和实验结果表明,LDBO算法效率优于INFLO,算法是有效可行的。 Mining outliers is to find exceptional objects that deviate from the most rest of the data set. Outlier detection based on density has attracted lots of attention, but the density-based algorithm named Local Outlier Factor (LOF) is not suitable for the data set with abnormal distribution, and the algorithm named INFLuenced Outlierness (INFLO) solves this problem by analyzing both k nearest neighbors and reverse k nearest neighbors of each data point at cost of inferior efficiency. To solve this problem, a local density-based algorithm named Local Density Based Outlier detection (LDBO) was proposed, which can improve outlier detection efficiency and effectiveness simultaneously. LDBO introduced definitions of strong k nearest neighbors and weak k nearest neighbors to realize outlier relation analysis of those data points located nearby. Furthermore, to improve the outlier detection efficiency, prejudgement was applied to avoid unnecessary reverse k nearest neighbor analysis as far as possible. Theoretical analysis and experimental results Indicate that LDBO outperforms INFLO in efficiency, and it is effective and feasible.
出处 《计算机应用》 CSCD 北大核心 2017年第10期2932-2937,共6页 journal of Computer Applications
基金 国家自然科学基金资助项目(61370077)~~
关键词 离群点检测 局部密度 强k近邻点 弱k近邻点 反向k近邻点集 outlier detection local density strong k nearest neighbors weak k nearest neighbors Reverse k NearestNeighbors (RkNN)
  • 相关文献

参考文献5

二级参考文献52

  • 1孙焕良,鲍玉斌,于戈,赵法信,王大玲.一种基于划分的孤立点检测算法[J].软件学报,2006,17(5):1009-1016. 被引量:16
  • 2薛安荣,鞠时光,何伟华,陈伟鹤.局部离群点挖掘算法研究[J].计算机学报,2007,30(8):1455-1463. 被引量:96
  • 3Breunig M M,Kriegel H P,Ng R T,et al.LOF:Identifying density-based local outliers[C]//Proc of ACM SIGMOD Conf.New York:ACM,2000:427-438. 被引量:1
  • 4Tang J,Chen Z,Fu A,et al.Enhancing effectiveness of outlier detections for low-density patterns[C]//Proc of Advances in Knowledge Discovery and Data Mining 6th Pacific Asia Conf.Berlin:Springer,2002:535-548. 被引量:1
  • 5Papadimitirou S,Kitagawa H,Gibbons P B,et al.LOCI:Fast outlier detection using the local correlation integral[C]//Proc of the 19th Int Conf on Data Engineering.Los Alamitos:IEEE Computer Society,2003:315-326. 被引量:1
  • 6Sanjay C,Pei Sun.SLOM:A new measure for local spatial outliers[J].Knowledge and Information Systems,2006,9(4):412-429. 被引量:1
  • 7Barnett V,Lewis T.Outliers in Statistical Data[M].New York:John Wiley and Sons,1994. 被引量:1
  • 8Johnson T,Kwok I,Ng R T.Fast computation of 2-dimensional depth contours[C]//Proc of the 4th Int Conf on Knowledge Discovery and Data Mining (KDD'98).New York:ACM,1998:224-228. 被引量:1
  • 9Knorr E M,Ng R T.Algorithms for mining distance-based outliers in large datasets[C]//Proc of the 24th Int Conf on Very Large Data Bases.New York:ACM,1998:392-403. 被引量:1
  • 10Ramaswamy S,Rastogi R,Shim K.Efficient algorithms for mining outliers from large data sets[C]//Proc of the 2000 ACM SIGMOD Int Conf on Management of Data.New York:ACM,2000:93-104. 被引量:1

共引文献91

同被引文献208

引证文献26

二级引证文献80

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部