Abstract
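Under the Gauss-Markov assumptions, the least squares estimator is the best linear unbiased estimator, with minimum variance, so geographically and temporally weighted regression based on least squares attains optimal estimates when these assumptions hold. In practice, however, the assumptions are sometimes violated. When the sample contains outliers or follows a heavy-tailed distribution, least squares estimates can be substantially biased. Quantile regression is less sensitive to outliers, is more robust than least squares regression, and applies under weaker conditions. More importantly, least squares regression can only explore the effect of explanatory variables on the conditional mean of the response, whereas quantile regression can explore their effect on the distribution of the response (for example, on several of its quantiles) and thus uncovers richer information. Building on the principle of local polynomial estimation, this paper proposes a geographically and temporally weighted quantile regression model based on multi-bandwidth local polynomials, obtains the coefficient estimates by a two-step iterative estimation method, and allows the optimal bandwidth to differ across explanatory variables (influencing factors). In numerical simulations comparing the model with geographically and temporally weighted least squares regression, the mean squared error and mean absolute error of the quantile regression coefficient estimates are both smaller than those of the least squares estimator (for example, at the 0.75 quantile, the mean squared error and mean absolute error of the least squares coefficient estimates are 10 times and 4 times those of the quantile regression estimates, respectively), which shows that the proposed quantile regression is robust and can explore the factors that affect the distribution of the response. Finally, the method is applied to commercial residential communities in Shanghai from 2017 to 2021 to examine how different factors affect housing prices at different quantiles (such as high, medium, and low prices), which illustrates its practical value. The empirical study shows that the same factor affects different price levels differently: the temporal and spatial distributions of a given factor's coefficient differ markedly among high, medium, and low housing prices, and the optimal bandwidths of different factors also differ. Compared with the least-squares-based MGTWR, the proposed quantile regression model is more robust to the presence of outliers (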
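The contrast between the two estimators can be written down in a simplified local-constant form. The notation below is an assumption made for illustration (it does not appear in the abstract): $\tau$ is the quantile level, $K_h$ a kernel with bandwidth $h$, and $d_{ij}$ the space-time distance between observations $i$ and $j$. The paper's multi-bandwidth local polynomial estimator and its two-step iteration are not reproduced here.

```latex
% Check (pinball) loss at quantile level \tau, 0 < \tau < 1
\rho_\tau(r) = r\,\bigl(\tau - \mathbf{1}\{r < 0\}\bigr)

% Kernel-weighted quantile regression at location (u_i, v_i, t_i)
\hat{\beta}_\tau(u_i, v_i, t_i)
  = \arg\min_{\beta} \sum_{j=1}^{n} K_h(d_{ij})\, \rho_\tau\bigl(y_j - x_j^{\top}\beta\bigr)

% Kernel-weighted least squares at the same location, for comparison
\hat{\beta}_{\mathrm{LS}}(u_i, v_i, t_i)
  = \arg\min_{\beta} \sum_{j=1}^{n} K_h(d_{ij})\, \bigl(y_j - x_j^{\top}\beta\bigr)^{2}
```

Because $\rho_\tau$ grows linearly in the residual while the squared loss grows quadratically, a single extreme $y_j$ pulls the least squares fit much harder than the quantile fit, which is the robustness property the abstract refers to.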
The geographically and temporally weighted regression method based on weighted least squares estimation achieves optimal estimates under the Gauss-Markov assumptions. However, these conditions are not always satisfied. If the data contain outliers or follow a heavy-tailed distribution, the least squares estimates may be significantly biased. Quantile regression, by contrast, is less affected by outliers, is more robust than least squares regression, and can be applied more broadly under more relaxed conditions. More importantly, the least squares regression model only focuses on the conditional mean of the response, while quantile regression explores the global distribution of the response variable (e.g., quantiles of the response variable) and can obtain richer information. In this paper, we propose a geographically and temporally weighted quantile regression model based on local polynomial estimation. This model allows different explanatory variables to have different optimal bandwidths and uses a two-step estimation method to obtain the coefficient estimates. To illustrate the advantages of the proposed method, we compare it with geographically and temporally weighted least squares regression through numerical simulations. The simulation results show that the mean squared error and the mean absolute error of the coefficient estimates for the proposed quantile regression model are both smaller than those of the least squares regression model. For example, at the 0.75 quantile, the mean squared error and mean absolute error of the coefficient estimates based on the least squares regression are 10 times and 4 times those based on the quantile regression, respectively. This indicates that, compared with the least squares regression model, the proposed method is robust and can explore the global distribution of the response variable. Finally, to illustrate the practical value of the method, we apply it to the data of Shanghai's commercial residential neig
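The robustness claim can be illustrated with a minimal sketch, assuming an intercept-only (local-constant) fit at a single target location. Everything in it is hypothetical and chosen for illustration: the toy observations, the Gaussian kernel, the single bandwidth, and the `time_scale` factor in the space-time distance. It is not the paper's multi-bandwidth local polynomial estimator. The kernel-weighted median minimizes the weighted check loss, while the kernel-weighted mean minimizes the weighted squared loss.

```python
import math

# Hypothetical toy observations: (u, v, t, y) = two spatial coordinates, time, response.
# The last record is a deliberate outlier placed exactly at the target location.
observations = [
    (0.0, 0.0, 0.0, 10.0),
    (1.0, 0.0, 0.0, 11.0),
    (0.0, 1.0, 0.0, 9.0),
    (1.0, 1.0, 1.0, 10.5),
    (0.5, 0.5, 0.0, 100.0),
]
target = (0.5, 0.5, 0.0)
bandwidth = 1.0   # assumed single bandwidth (the paper allows one per variable)
time_scale = 1.0  # assumed weight of the temporal gap relative to the spatial gap


def spacetime_distance(p, q, scale):
    """Distance combining the spatial gap and a scaled temporal gap."""
    du, dv, dt = p[0] - q[0], p[1] - q[1], p[2] - q[2]
    return math.sqrt(du * du + dv * dv + scale * dt * dt)


def gaussian_kernel(dist, h):
    """Gaussian kernel weight for a given distance and bandwidth."""
    return math.exp(-0.5 * (dist / h) ** 2)


def check_loss(r, tau):
    """Check (pinball) loss of a residual r at quantile level tau."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def weighted_check_loss(q, ys, ws, tau):
    """Kernel-weighted check loss of a candidate local quantile q."""
    return sum(w * check_loss(y - q, tau) for y, w in zip(ys, ws))


def weighted_quantile(ys, ws, tau):
    """Smallest y whose cumulative weight reaches tau of the total weight.

    This value minimizes the weighted check loss over a constant fit.
    """
    pairs = sorted(zip(ys, ws))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for y, w in pairs:
        cumulative += w
        if cumulative >= tau * total:
            return y
    return pairs[-1][0]


def weighted_mean(ys, ws):
    """Minimizer of the weighted squared loss over a constant fit."""
    return sum(y * w for y, w in zip(ys, ws)) / sum(ws)


ys = [obs[3] for obs in observations]
ws = [gaussian_kernel(spacetime_distance(obs, target, time_scale), bandwidth)
      for obs in observations]

local_median = weighted_quantile(ys, ws, 0.5)
local_mean = weighted_mean(ys, ws)
print("local median:", local_median)
print("local mean:", round(local_mean, 2))
```

In this toy setting the local median stays within the range of the ordinary observations, while the local mean is dragged far toward the outlier. A full implementation would fit covariate coefficients rather than a constant, for instance by linear programming, and would select bandwidths by cross-validation.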
Authors
王守芬
王守霞
顾建祥
WANG Shoufen; WANG Shouxia; GU Jianxiang (Shanghai Surveying and Mapping Institute, Shanghai 200063, China; School of Mathematical Sciences, Peking University, Beijing 100871, China; Key Laboratory of Spatial-temporal Big Data Analysis and Application of Natural Resources in Megacities, Ministry of Natural Resources, Shanghai 200063, China)
Source
《地球信息科学学报》
EI
CSCD
PKU Core (Peking University Core Journals)
2024, Issue 3, pp. 567-590 (24 pages)
Journal of Geo-information Science
Funding
Shanghai 2021 "Science and Technology Innovation Action Plan" Social Development Science and Technology Research Project (21DZ1204100).
Keywords
geographically and temporally weighted regression
multiple bandwidths
local polynomial estimation
quantile regression
robustness
global distribution
outlier
heteroscedasticity