Abstract
Outlier detection is an important research direction in data mining and machine learning; it aims to identify samples that behave significantly differently from the other samples. This paper proposes a multigranularity outlier detection method based on fuzzy neighborhood entropy. First, fuzzy similarity is introduced into neighborhood entropy and relative entropy, yielding two uncertainty measures: fuzzy neighborhood entropy and relative fuzzy neighborhood entropy. Second, the differences between these two measures are analyzed from both logical and geometric perspectives. Finally, the technique for order preference by similarity to an ideal solution (TOPSIS) is combined with a multigranularity sequence to define a new criterion for a sample's degree of outlierness, TFMME-OF (TOPSIS and Fuzzy Multigranulation Mixed Entropy-based Outlier Factor). Experimental results show that the proposed method detects outliers more effectively than other representative methods of the same kind.
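The abstract names TOPSIS as the ranking step but does not give the formulas used in TFMME-OF. As a rough illustration only, the following is a minimal sketch of generic, equally weighted TOPSIS applied to a hypothetical matrix of per-granularity outlier scores (rows are samples, columns are granularities). The function name `topsis_closeness`, the score matrix, and the choice of vector normalization are assumptions for illustration, not the authors' definition.

```python
import numpy as np


def topsis_closeness(scores, benefit=None):
    """Generic TOPSIS relative closeness for each row (hypothetical helper).

    scores  : (n_samples, n_criteria) array of non-negative scores.
    benefit : boolean mask, True where a larger value is "better".
              Defaults to all True (larger score = more outlying).
    """
    X = np.asarray(scores, dtype=float)
    if benefit is None:
        benefit = np.ones(X.shape[1], dtype=bool)
    benefit = np.asarray(benefit, dtype=bool)

    # Vector-normalize each column; leave all-zero columns unchanged.
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0
    V = X / norms

    # Ideal and anti-ideal solutions, chosen column by column.
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

    # Euclidean distance of every sample to both reference points.
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)

    # Relative closeness in [0, 1]; 0.5 when all rows coincide.
    denom = d_pos + d_neg
    return np.divide(d_neg, denom, out=np.full_like(d_neg, 0.5), where=denom > 0)


# Hypothetical scores: three samples evaluated at two granularities.
example_scores = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8]]
closeness = topsis_closeness(example_scores)
most_outlying = int(np.argmax(closeness))
```

Under these assumptions, the sample with the largest closeness value is the one nearest the "most outlying" ideal point across all granularities; how the paper actually builds its score matrix from the fuzzy entropies is not stated in the abstract.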
Authors
汪贝琪
周杰
高灿
WANG Bei-qi; ZHOU Jie; GAO Can (College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China; Key Laboratory of Intelligent Information Processing (Shenzhen University), Guangdong Province, Shenzhen 518060, China)
Source
《模糊系统与数学》
Peking University Core Journal (北大核心)
2022, No. 6, pp. 102-113 (12 pages)
Fuzzy Systems and Mathematics
Funding
Supported by the National Natural Science Foundation of China (61806127, 62076164).
Keywords
Outlier Detection
Neighborhood Entropy
Fuzzy Neighborhood Entropy
TOPSIS
Multigranularity Sequence