Abstract
Releasing search logs can leak users' private information. To address this, the paper introduces differential privacy into search log publishing and proposes a privacy-preserving algorithm that satisfies ε-differential privacy. The data source is preprocessed with a prefix tree: frequent items are identified from the tree, and infrequent items are pruned according to a threshold k. Laplace noise is then added to the results to perturb the true values, and a theoretical proof shows that the method satisfies differential privacy. The experiments use a distributed, multi-machine processing strategy, which greatly reduces processing time, and evaluate performance to verify the feasibility of the algorithm. Based on the experimental results, an appropriate pruning threshold k is selected so that the published data balances the level of privacy protection against data accuracy.
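The pipeline summarized in the abstract (build prefix counts, prune below a threshold k, perturb with Laplace noise) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the `max_depth` cap, and the choice of `max_depth` as the sensitivity (which assumes each user contributes a single query) are all assumptions made for illustration, and the sketch is not in itself a privacy proof.

```python
import random
from collections import Counter


def prefix_counts(queries, max_depth):
    """Count how many queries pass through each prefix (a tuple of leading terms)."""
    counts = Counter()
    for query in queries:
        # Assumption: terms are whitespace-separated; depth is capped at max_depth.
        terms = tuple(query.split())[:max_depth]
        for depth in range(1, len(terms) + 1):
            counts[terms[:depth]] += 1
    return counts


def prune(counts, k):
    """Drop prefixes seen fewer than k times.

    A prefix is never more frequent than its parent, so every kept
    prefix also has all of its ancestors kept.
    """
    return {prefix: c for prefix, c in counts.items() if c >= k}


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release(queries, k, epsilon, max_depth=3, seed=None):
    """Return noisy counts for the prefixes that survive pruning."""
    rng = random.Random(seed)
    kept = prune(prefix_counts(queries, max_depth), k)
    # Assumed sensitivity: one query touches at most max_depth prefix counts.
    scale = max_depth / epsilon
    return {prefix: c + laplace_noise(scale, rng) for prefix, c in kept.items()}
```

A larger k removes more rare (and potentially identifying) prefixes at the cost of coverage, while a smaller ε adds more noise to each released count; this is the privacy versus accuracy trade-off the abstract refers to.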
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
PKU Core Journal (北大核心)
2016, No. 3, pp. 540-544 (5 pages)
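Funding
Supported by the Scientific Research Innovation Project of the Shanghai Municipal Education Commission (12YZ095)
Scientific Research Foundation for Returned Overseas Scholars, Ministry of Education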
Keywords
search log
privacy protection
differential privacy
prefix tree
Laplace mechanism