
Missing data imputation of large-scale questionnaire based on matrix completion
Abstract: Large-scale questionnaire surveys inevitably suffer from missing data: both item non-response and invalid responses degrade the quality of data analysis and the accuracy of the resulting decisions. Imputing missing questionnaire data can be cast as a matrix completion (MC) problem and handled with low-rank matrix recovery techniques. Under missing ratios of 5%, 10%, 20%, 40%, and 50%, this paper repairs the missing data with an MC method based on the singular value thresholding algorithm and compares it with four commonly used missing-data methods: hot deck imputation, K-nearest neighbors, multiple imputation by chained equations, and linear interpolation. The results show that the MC method has a clear advantage in both imputation accuracy and imputation error, and that it can supply a reasonably reliable complete data set for large-scale questionnaire surveys. The MC method can therefore serve as a reference when choosing a missing-data treatment for such surveys.
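The abstract names singular value thresholding as the completion algorithm but does not spell it out on this page. The sketch below shows one common form of the iteration (soft-threshold the singular values, then correct on the observed cells), assuming a respondents-by-items numeric matrix and a boolean mask of observed responses. The function name `svt_complete`, the default threshold `tau`, and the default step size `delta` are illustrative assumptions based on heuristics commonly used with this algorithm; they are not the settings reported by the authors.

```python
import numpy as np


def svt_complete(X_obs, mask, tau=None, delta=None, tol=1e-4, max_iter=1000):
    """Fill missing cells of X_obs by singular value thresholding (a sketch).

    X_obs : 2-D numeric array; cells where mask is False are ignored.
    mask  : boolean array of the same shape, True where a response exists.
    """
    mask = np.asarray(mask, dtype=bool)
    X_obs = np.where(mask, X_obs, 0.0)           # zero out the missing cells
    n1, n2 = X_obs.shape
    m = int(mask.sum())                          # number of observed cells
    if m == 0:
        raise ValueError("no observed entries to complete from")
    if tau is None:
        tau = 5.0 * np.sqrt(n1 * n2)             # assumed heuristic threshold
    if delta is None:
        delta = 1.2 * n1 * n2 / m                # assumed heuristic step size

    norm_obs = np.linalg.norm(X_obs)
    Y = np.zeros_like(X_obs)                     # auxiliary (dual) matrix
    X = np.zeros_like(X_obs)                     # current low-rank estimate
    for _ in range(max_iter):
        # Shrinkage step: soft-threshold the singular values of Y by tau.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt
        # Residual restricted to the observed cells.
        R = np.where(mask, X_obs - X, 0.0)
        if np.linalg.norm(R) <= tol * norm_obs:
            break
        Y += delta * R
    # Keep the observed responses and use the estimate only where data is missing.
    return np.where(mask, X_obs, X)
```

For ordinal questionnaire items, the filled values would typically be rounded and clipped to the valid answer range afterwards; that post-processing step is also an assumption here, not something stated in the abstract.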
Authors: GAO Hai-yan (高海燕); LI Wei-xin (李唯欣); NIU Cheng-ying (牛成英) (Lanzhou University of Finance and Economics, Lanzhou 730020, China)
Source: Journal of Hubei Normal University: Natural Science (《湖北师范大学学报(自然科学版)》), 2023, No. 3, pp. 1-8 (8 pages)
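Funding: National Social Science Fund of China (19XTJ002); Natural Science Foundation of Gansu Province (23JRRA1186); Gansu Province Outstanding Graduate Student "Innovation Star" Project (2022CXZX-701).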
Keywords: large-scale questionnaire data; matrix completion; missing data imputation
