Abstract
To address the high computational complexity of existing local outlier detection algorithms, which do not partition the data objects, a deviation-based local outlier detection algorithm is proposed. The algorithm first partitions the data set so that each potential local outlier is placed in the same data block as its nearest cluster. Within each data block, the deviation degree of every data object is characterized by the coefficient of variation, which yields each object's local deviation factor within its block and reveals the potential local outliers. Theoretical analysis and experimental results indicate that the algorithm identifies local outliers well and that its detection accuracy and time efficiency are both better than those of the classical local outlier factor (LOF) algorithm.
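The abstract does not give the formulas for the partitioning step or for the local deviation factor, so the following is only a minimal sketch of the two-stage idea under stated assumptions. The nearest-centre rule in `partition`, the ratio score in `block_scores`, and all function names and sample points are hypothetical illustrations, not the authors' definitions. The only standard quantity used is the coefficient of variation, the standard deviation divided by the mean.

```python
import math
from statistics import mean, pstdev


def partition(points, centers):
    """Put each point in the block of its nearest centre.

    Assumption: the centres come from some prior clustering step, so an
    outlier lands in the same block as the cluster closest to it.
    """
    blocks = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        blocks[nearest].append(p)
    return blocks


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def block_scores(block):
    """Score every point inside one block.

    Hypothetical score: a point's mean distance to the other points in the
    block, divided by the block-wide average of that quantity. Values well
    above 1 mark candidates for local outliers. Also returns the block's
    coefficient of variation as a measure of its dispersion.
    """
    if len(block) < 2:
        return [(p, 0.0) for p in block], 0.0
    # The distance from a point to itself is 0, so dividing by len - 1
    # gives the mean distance to the other points.
    avg_dist = [sum(math.dist(p, q) for q in block) / (len(block) - 1) for p in block]
    overall = mean(avg_dist)
    scores = [(p, d / overall if overall else 0.0) for p, d in zip(block, avg_dist)]
    return scores, coefficient_of_variation(avg_dist)


# Two tight clusters plus one stray point near the first cluster (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (4, 4),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers = [(0.5, 0.5), (10.5, 10.5)]

blocks = partition(points, centers)
results = [block_scores(b) for b in blocks]
for scores, cv in results:
    top_point, top_score = max(scores, key=lambda s: s[1])
    print(f"block CV={cv:.3f}, highest score {top_score:.3f} at {top_point}")
```

Because each point is compared only with the members of its own block, the pairwise distance work scales with the block sizes rather than with the whole data set, which is the efficiency argument the abstract makes against an unpartitioned approach.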
Source
《仪器仪表学报》
EI
CAS
CSCD
Peking University Core Journal
2014, No. 10, pp. 2293-2298 (6 pages)
Chinese Journal of Scientific Instrument
Funding
National Natural Science Foundation of China (61272029)
Li Shangda Discipline Construction Fund of Jimei University (ZC2011018)
Keywords
clustering
local outlier detection
local deviation factor
variation coefficient