Abstract
In management decision-making, self-reported data on the true state of the people being managed are often of low quality because of privacy and sensitivity concerns. The resulting sample data are heavily biased, which makes it difficult to learn the real situation of the target population. To address this problem while also meeting the demand for data privacy protection in the era of the digital economy, this paper develops a data collection method based on indirect reports within social networks and, building on network sampling and statistical inference theory, designs an ego-centric sampling method (ECM) for estimating population quantities from indirectly reported sample data. The method is simple to implement: survey subjects can be randomly sampled or covered by a census. In addition to each respondent's self-reported data, it collects the respondent's reports about close social contacts, which avoids the problem of people declining to provide data, or providing untrue data, about themselves for reasons of sensitivity. The proposed estimator achieves high-precision estimates of the population from the reported data and allows self-reported and cross-reported data to be validated against each other. The method is validated on the online social network of a hard-to-reach population with 556,627 active users; sampling experiments show that the estimation error of ECM is below 3% for the network-wide average number of friends and for population characteristics. Furthermore, an empirical study surveys employees of an enterprise on general and privacy-sensitive questions using self-report and cross-report questionnaires, and derives population estimates through the indirect estimation method; the results demonstrate the practicality and effectiveness of ECM.
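To make the idea of combining self-reports with reports about close contacts concrete, here is a minimal sketch. It is not the paper's ECM estimator, whose formulas are not given in this abstract. It assumes a hypothetical record format (`Report`) in which each sampled respondent gives a self-reported binary trait, the number of close contacts reported on, and how many of those contacts have the trait; the function names and the plain ratio estimator are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Report:
    """Hypothetical survey record; field names are assumptions, not from the paper."""
    self_trait: bool        # respondent's own answer about the trait
    n_contacts: int         # number of close contacts the respondent reports on
    n_contacts_trait: int   # how many of those contacts have the trait


def self_report_proportion(reports: Sequence[Report]) -> float:
    """Share of respondents who report the trait about themselves."""
    if not reports:
        raise ValueError("no reports")
    return sum(r.self_trait for r in reports) / len(reports)


def indirect_proportion(reports: Sequence[Report]) -> float:
    """Share of reported-on contacts who have the trait (naive ratio estimator)."""
    total_contacts = sum(r.n_contacts for r in reports)
    if total_contacts == 0:
        raise ValueError("no contacts reported")
    return sum(r.n_contacts_trait for r in reports) / total_contacts


def mean_contacts(reports: Sequence[Report]) -> float:
    """Average number of close contacts per respondent."""
    if not reports:
        raise ValueError("no reports")
    return sum(r.n_contacts for r in reports) / len(reports)
```

Comparing `self_report_proportion` with `indirect_proportion` on the same sample gives a rough cross-check between the two data sources. The naive ratio treats every reported contact equally, so well-connected people can be over-represented among the contacts; a practical estimator would likely need to correct for that.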
Authors
LU Xin (吕欣)
LIU Chu-chu (刘楚楚)
CAI Meng-si (蔡梦思)
CHEN Sa-ran (陈洒然)
Affiliations: College of Systems Engineering, National University of Defense Technology, Changsha 410073, China; School of Computing, National University of Singapore, Singapore 117417, Singapore; State Key Laboratory on Blind Signal Processing, Chengdu 610041, China
Source
Journal of Management Sciences in China (《管理科学学报》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2023, Issue 5, pp. 103-120 (18 pages)
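Funding
Major Research Plan of the National Natural Science Foundation of China (91846301)
National Science Fund for Distinguished Young Scholars (72025405)
Major Program of the National Social Science Fund of China (22ZDA102)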
Keywords
privacy
data authenticity
network sampling
indirect estimation
statistical inference