Abstract
This paper proposes, for the first time, a shuffle differential privacy protection scheme based on the k-modes clustering algorithm (SDPk-modes). SDPk-modes partitions the data into groups according to the distances between records, which gives granularity fine enough to optimize utility, and it uses a gradient-based random perturbation technique to shorten the time needed to compute the optimal probability. During k-modes clustering, feature vectors that occur frequently in the data are taken as the cluster centers, and a distance measure based on attribute entropy is used. Together these speed up convergence to the cluster centers, address the slow clustering and the tendency to fall into local optima of the original algorithm, and significantly improve clustering quality. Experiments show that the proposed scheme outperforms current comparable schemes.
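The clustering ideas named in the abstract (frequent feature vectors as cluster centers, and a distance based on attribute entropy) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the entropy-based distance, so the sketch assumes a simple-matching distance in which each attribute's mismatch is weighted by its normalized Shannon entropy. All function names (`attribute_entropy_weights`, `weighted_mismatch`, `frequent_vector_centers`, `k_modes`) are hypothetical, and the shuffle differential privacy and gradient-based perturbation steps are deliberately left out.

```python
import math
from collections import Counter


def attribute_entropy_weights(rows):
    """Return one weight per attribute: its Shannon entropy, normalized to sum to 1.

    Assumption: higher-entropy attributes get more weight. The paper's actual
    weighting is not stated in the abstract.
    """
    n_attrs = len(rows[0])
    total = len(rows)
    entropies = []
    for j in range(n_attrs):
        counts = Counter(row[j] for row in rows)
        entropies.append(
            -sum((c / total) * math.log2(c / total) for c in counts.values())
        )
    s = sum(entropies)
    if s == 0:  # every attribute is constant: fall back to uniform weights
        return [1.0 / n_attrs] * n_attrs
    return [h / s for h in entropies]


def weighted_mismatch(a, b, weights):
    """Sum the weights of the attributes on which records a and b differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def frequent_vector_centers(rows, k):
    """Use the k most frequent distinct records as the starting modes."""
    return [list(vec) for vec, _ in Counter(map(tuple, rows)).most_common(k)]


def k_modes(rows, k, max_iter=20):
    """Plain k-modes loop using the two helpers above (no privacy mechanism)."""
    weights = attribute_entropy_weights(rows)
    centers = frequent_vector_centers(rows, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest mode under the entropy-weighted distance.
        new_labels = [
            min(range(len(centers)),
                key=lambda c: weighted_mismatch(row, centers[c], weights))
            for row in rows
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        # Update step: each mode becomes the per-attribute most common value.
        for c in range(len(centers)):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [Counter(col).most_common(1)[0][0]
                              for col in zip(*members)]
    return labels, centers
```

Calling `k_modes(rows, k)` on a list of equal-length tuples of categorical values returns the cluster label of each record and the final modes. Starting from frequent records rather than random ones makes the run deterministic, which is one plausible reading of how the scheme avoids poor local optima.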
Authors
QI Fu (祁富); CHEN Limin (陈丽敏)
(School of Mathematical Science, Mudanjiang Normal University, Mudanjiang 157011, China; School of Computer and Information Technology, Mudanjiang Normal University, Mudanjiang 157011, China)
Source
Journal of Mudanjiang Normal University: Natural Sciences Edition (《牡丹江师范学院学报(自然科学版)》)
2024, No. 2, pp. 6-13 (8 pages)
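Funding
Natural Science Foundation of Heilongjiang Province (LH2019F051)
Key Science and Technology Innovation Project of Mudanjiang Normal University (kjcx2023-126mdjnu)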