Abstract
[Purpose/Significance] To support sound decision-making, a government department often shares certain data with other departments according to business needs, providing an auxiliary reference for departmental management or service decisions. Data sharing is essential to this process, but sharing government data without proper precautions can easily leak private information. [Method/Process] For the scenario in which government departments share statistical data, a government data sharing method based on local differential privacy is proposed. The method builds on the Generalized Randomized Response (GRR) algorithm and introduces the idea of data binning: equi-width binning assigns data records to a smaller data domain, which addresses the large statistical error that current privacy protection algorithms show when the data domain is large and the amount of data is small. [Result/Conclusion] The proposed algorithm was compared with GRR on both simulated datasets and a real dataset. The experimental results show that the proposed algorithm effectively reduces statistical error and maintains its utility under different distributions and data domain sizes.
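The abstract names the two building blocks (GRR and equi-width binning) without giving the algorithm's details. The following is a minimal Python sketch of how the two could be combined, not the authors' implementation. It assumes integer-valued records in [0, domain_size), the standard GRR rule (report the true value with probability p = e^ε / (e^ε + d − 1), otherwise a uniformly chosen different value), and the standard unbiased frequency estimator. The function names and the `bin_width` parameter are hypothetical.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain_size, epsilon, rng):
    """Standard GRR: keep the true value with probability p,
    otherwise report one of the other domain_size - 1 values uniformly."""
    e = math.exp(epsilon)
    p = e / (e + domain_size - 1)
    if rng.random() < p:
        return value
    other = rng.randrange(domain_size - 1)
    # Skip over the true value so that it is never chosen here.
    return other if other < value else other + 1


def grr_estimate(reports, domain_size, epsilon):
    """Unbiased frequency estimates computed from perturbed reports."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + domain_size - 1)
    q = 1.0 / (e + domain_size - 1)
    counts = Counter(reports)
    return [(counts.get(v, 0) - n * q) / (p - q) for v in range(domain_size)]


def binned_grr_estimate(values, domain_size, bin_width, epsilon, rng):
    """Assumed combination: map each value to an equi-width bin,
    then run GRR over the (smaller) domain of bin indices."""
    num_bins = math.ceil(domain_size / bin_width)
    reports = [grr_perturb(v // bin_width, num_bins, epsilon, rng) for v in values]
    return grr_estimate(reports, num_bins, epsilon)


if __name__ == "__main__":
    rng = random.Random(0)
    values = [rng.randrange(100) for _ in range(500)]
    # 100 raw values collapse into 10 bins of width 10 before perturbation.
    print(binned_grr_estimate(values, 100, 10, 1.0, rng))
```

Because p depends on the domain size d, shrinking the domain from raw values to bins raises the probability of a truthful report for the same ε. This is the intuition behind the reduced error claimed above, at the cost of estimating per-bin rather than per-value counts.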
Authors
Hao Yurong (郝玉蓉)
Piao Chunhui (朴春慧)
Yan Jiaqi (颜嘉麒)
Jiang Xuehong (蒋学红)
Hao Yurong; Piao Chunhui; Yan Jiaqi; Jiang Xuehong (Shijiazhuang Tiedao University, Shijiazhuang 050043; Laboratory for Electromagnetic Environmental Effects and Information Processing, Shijiazhuang 050043; Nanjing University, Nanjing 210023; Department of Housing & Urban-Rural Development of Hebei Province, Shijiazhuang 050051)
Source
Journal of Intelligence (《情报杂志》)
CSSCI
PKU Core Journals (北大核心)
2021, No. 2, pp. 169-175, 137 (8 pages in total)
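Funding
One of the research outcomes of the Hebei Provincial Department of Education Graduate Student Innovation Ability Training Funding Project "Research on Privacy-Preserving Government Data Sharing Based on Hyperledger" (No. CXZZSS2020071).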
Keywords
government data sharing
local differential privacy
data binning
privacy protection algorithm