High Dimensional Data Publishing Algorithm Based on Clustering and Compressed Sensing

Cited by: 1
Abstract: Existing high-dimensional data publishing mechanisms suffer from the curse of dimensionality: too much noise has to be added, which lowers data utility. To address this problem, a high-dimensional data publishing algorithm based on clustering and compressed sensing, PrivCACS, is proposed. The attribute set is first clustered according to attribute sensitivity. Using mutual information as the measure of attribute correlation, non-sensitive attributes that depend strongly on sensitive attributes are added to the sensitive attribute set. This divides the attributes into a non-sensitive attribute set C1 and a sensitive attribute set C2, from which the corresponding data subsets D1 and D2 are obtained. The subset D2, which would leak private information, is reduced by compressed sensing to a low-dimensional synopsis, and differential privacy noise is added to that synopsis. A synthetic data set is then reconstructed with an improved orthogonal matching pursuit algorithm and merged with the non-sensitive data set D1 for publication. Experiments on real data sets show that PrivCACS outperforms the traditional PrivBayes and Jtree algorithms on SVM classification, giving higher data utility while preserving privacy.
Authors: LIU Zhenpeng; CHEN Jie; WANG Shilei; GUO Chao; LI Xiaofei (School of Electronic Information Engineering, Hebei University, Baoding 071002, China; Information Technology Center, Hebei University, Baoding 071002, China)
Source: Journal of Zhengzhou University: Natural Science Edition (indexed in CAS and the Peking University Core Journals list), 2023, No. 2, pp. 63-69 (7 pages)
Funding: Ministry of Education Cloud-Data Fusion Science and Education Innovation Fund (2017A20004); Natural Science Foundation of Hebei Province (F2019201427).
Keywords: high-dimensional data; attribute clustering; compressed sensing; differential privacy; improved orthogonal matching pursuit
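
The compress, perturb, and reconstruct steps named in the abstract can be illustrated with a minimal sketch. It is not the PrivCACS implementation: it uses standard orthogonal matching pursuit rather than the paper's improved variant, a random Gaussian measurement matrix, and Laplace noise whose `sensitivity` is supplied by the caller. The function names `omp` and `publish_sensitive` and all parameters are hypothetical, and the sketch assumes each row of the sensitive subset D2 is approximately `k`-sparse. It makes no claim to a formal privacy guarantee.

```python
import numpy as np


def omp(A, y, k):
    """Standard orthogonal matching pursuit (not the paper's improved
    variant): find a k-sparse x such that A @ x approximates y."""
    n = A.shape[1]
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit of y on the columns selected so far.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(n)
    x[support] = coef
    return x


def publish_sensitive(D2, m, k, epsilon, sensitivity, rng):
    """Illustrative pipeline for the sensitive subset D2 (rows = records).

    Assumptions (not from the source): Gaussian measurement matrix,
    Laplace noise with scale sensitivity / epsilon, caller-chosen
    sensitivity, and per-row reconstruction with standard OMP.
    """
    n_cols = D2.shape[1]
    # Compress each record from n_cols dimensions down to m < n_cols.
    Phi = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n_cols))
    summary = D2 @ Phi.T
    # Perturb the low-dimensional summary instead of the raw data.
    noisy = summary + rng.laplace(0.0, sensitivity / epsilon, size=summary.shape)
    # Reconstruct a synthetic record from each noisy summary row.
    return np.vstack([omp(Phi, row, k) for row in noisy])
```

Under these assumptions, the published table would be formed by concatenating the untouched non-sensitive subset with the synthetic output, for example `np.hstack([D1, publish_sensitive(D2, m, k, epsilon, sensitivity, rng)])`. Noise is added to `m` values per record rather than to every column of D2, which is the motivation the abstract gives for compressing first.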