Abstract
As information technology advances in the era of big data, privacy has drawn growing attention. With the spread of mobile devices in particular, protecting users' personal information while still publishing data has become a major challenge. Researchers first proposed centralized differential privacy, which relies on a trusted third party, but in practice that trust assumption usually does not hold. Local differential privacy was then developed on top of centralized differential privacy: it defends against privacy attacks from an untrusted third party and remains effective against attackers with arbitrary background knowledge. The market, however, must satisfy service providers as well as users, so balancing the two requires a way for providers to carry out their analysis tasks. RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response) handles this well: it perturbs user data with two rounds of randomized response to guarantee the strength of privacy protection, and it decodes the perturbed data with a Lasso regression model to guarantee accurate frequency estimation. The contribution of this paper is to apply RAPPOR to the collection of COVID-19 epidemic information, obtaining truthful epidemic data while protecting respondents' privacy. Experiments on a dataset of confirmed COVID-19 cases across the United States show that the proposed method fits the true results closely and completes the frequency-feature extraction task. RAPPOR thus carries local differential privacy from theory into application and gives individuals practical privacy protection.
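The pipeline the abstract describes (two rounds of randomized response on the client, then Lasso-based decoding on the server) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Bloom-filter size `K`, hash count `H`, the probabilities `F`, `P`, `Q`, the region names, and the counts are all hypothetical and synthetic, each user reports once, and a small hand-written non-negative Lasso (coordinate descent) stands in for whatever solver the paper used.

```python
import hashlib
import random

# All constants are illustrative assumptions, not values from the paper.
K, H = 128, 2                 # Bloom filter bits, hash functions per value
F, P, Q = 0.5, 0.25, 0.75     # permanent-RR and instantaneous-RR probabilities

def bloom_bits(value):
    """Set of Bloom-filter positions that `value` turns on."""
    return {int(hashlib.sha256(f"{i}:{value}".encode()).hexdigest(), 16) % K
            for i in range(H)}

def randomize(bits, rng):
    """Apply both rounds of randomized response to one user's Bloom filter."""
    report = []
    for j in range(K):
        u = rng.random()
        # Round 1 (permanent): force 1 w.p. F/2, force 0 w.p. F/2, else keep.
        perm = 1 if u < F / 2 else 0 if u < F else int(j in bits)
        # Round 2 (instantaneous): report 1 w.p. Q if perm is 1, else w.p. P.
        report.append(int(rng.random() < (Q if perm else P)))
    return report

def estimate_bit_counts(reports):
    """De-bias per-bit report totals into estimated true bit counts."""
    n = len(reports)
    p_star = P + 0.5 * F * (Q - P)   # P(reported bit = 1 | true bit = 0)
    q_star = Q - 0.5 * F * (Q - P)   # P(reported bit = 1 | true bit = 1)
    return [(sum(r[j] for r in reports) - p_star * n) / (q_star - p_star)
            for j in range(K)]

def nonneg_lasso(columns, y, lam, sweeps=100):
    """Coordinate descent for min 0.5*||y - X b||^2 + lam*||b||_1, b >= 0.

    Each entry of `columns` is the set of rows where that 0/1 column of X is 1.
    """
    b = [0.0] * len(columns)
    resid = list(y)                  # residual y - X b, with b = 0 initially
    for _ in range(sweeps):
        for m, col in enumerate(columns):
            rho = sum(resid[j] for j in col) + len(col) * b[m]
            new = max(0.0, (rho - lam) / len(col))
            delta = new - b[m]
            if delta:
                for j in col:
                    resid[j] -= delta
                b[m] = new
    return b

def run_demo(seed=0):
    """Simulate reports for synthetic counts and decode them."""
    rng = random.Random(seed)
    # Synthetic toy counts for hypothetical regions; not real epidemic data.
    true_counts = {"region_A": 5000, "region_B": 3000, "region_C": 1500,
                   "region_D": 500, "region_E": 0}
    columns = [bloom_bits(name) for name in true_counts]
    reports = [randomize(col, rng)
               for col, count in zip(columns, true_counts.values())
               for _ in range(count)]
    y = estimate_bit_counts(reports)
    estimates = nonneg_lasso(columns, y, lam=50.0)
    return true_counts, dict(zip(true_counts, estimates))
```

Calling `run_demo()` returns the synthetic counts alongside the decoded estimates. The server never sees an individual's true region, only the randomized bit vectors, yet the aggregate frequencies remain recoverable up to sampling noise. The L1 penalty shrinks every estimate slightly and pushes small, noise-only estimates toward zero.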
Authors
HUANG Jue (黄觉)
ZHOU Chun-lai (周春来)
Department of Information, Renmin University, Beijing 100872, China
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2022, Issue 7, pp. 350-356 (7 pages)
Funding
Key Program of the National Natural Science Foundation of China (61732006)
National Natural Science Foundation of China (61972404, 12071478)
Keywords
Local differential privacy
RAPPOR
Frequency characteristics
Randomized response