Funding: Supported by the National Natural Science Foundation of China (Grant No. 11971291), the National Social Science Foundation of China (Grant No. 19BTJ032), the Fujian Alliance of Mathematics (Grant No. 2023SXLMMS10), and the Scientific Research Climbing Program of Xiamen University of Technology (Grant No. XPDKT20037).
Abstract: In this paper, the authors investigate three aspects of statistical inference for partially linear regression models in which some covariates are measured with error. First, a bandwidth selection procedure is proposed that combines the difference-based technique with the GCV method. Second, a goodness-of-fit test procedure is proposed as an extension of the generalized likelihood technique. Third, a variable selection procedure for the parametric part is provided, based on nonconcave penalization and corrected profile least squares. As in "Variable selection via nonconcave penalized likelihood and its oracle properties" (J. Amer. Statist. Assoc., 96, 2001, 1348-1360), it is shown that the resulting estimator has an oracle property with a proper choice of regularization parameters and penalty function. Simulation studies are conducted to illustrate the finite-sample performance of the proposed procedures.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12201091), the Natural Science Foundation of Chongqing (Grant Nos. CSTB2022NSCQ-MSX0852, cstc2021jcyj-msxmX0502), the Innovation Support Program for Chongqing Overseas Returnees (Grant No. cx2020025), the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant Nos. KJQN202100526, KJQN201900511), the National Statistical Science Research Program (Grant No. 2022LY019), and the Chongqing University Innovation Research Group Project: Nonlinear Optimization Method and Its Application (Grant No. CXQT20014).
Abstract: In this paper, we develop a flexible semiparametric model averaging marginal regression procedure to forecast the joint conditional quantile function of the response variable for ultrahigh-dimensional data. First, we approximate the joint conditional quantile function by a weighted average of one-dimensional marginal conditional quantile functions that have varying coefficient structures, and employ a local linear regression technique to derive consistent estimates of the marginal conditional quantile functions. Second, based on the estimated marginal conditional quantile functions, we estimate and select the significant model weights involved in the approximation by a nonconvex penalized quantile regression. Under some relaxed conditions, we establish the asymptotic properties of the nonparametric kernel estimators and the oracle estimators of the model averaging weights. We further derive the oracle property for the proposed nonconvex penalized model averaging procedure. Finally, simulation studies and a real data analysis are conducted to illustrate the merits of the proposed model averaging method.
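The abstract above builds on quantile regression, whose standard loss is the check function rho_tau(u) = u(tau - 1{u < 0}). The following is a minimal sketch of that loss only, not of the authors' model averaging procedure; the helper name `quantile_by_check_loss` and the toy data are assumptions made for illustration.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0}), elementwise."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0).astype(float))

def quantile_by_check_loss(y, tau):
    """Return the sample point that minimizes the total check loss.

    Hypothetical helper: candidates are restricted to the observed values,
    which is enough to show that the minimizer is a sample quantile.
    """
    y = np.asarray(y, dtype=float)
    totals = np.array([check_loss(y - q, tau).sum() for q in y])
    return y[np.argmin(totals)]

# Toy data (assumed): the integers 1, 2, ..., 101.
y = np.arange(1, 102)
print(quantile_by_check_loss(y, 0.5))  # sample median for tau = 0.5
```

A penalized quantile regression adds a penalty on the coefficients to the sum of check losses over residuals; the nonconvex penalty used in the paper is not specified in the abstract and is not reproduced here.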
Funding: Supported by the National Natural Science Foundation of China (Grant No. 12071348) and the Fundamental Research Funds for Central Universities, China (Grant No. 2023-3-2D-04).
Abstract: In this paper, we focus on partially linear varying-coefficient quantile regression with missing observations under ultra-high dimension, where the missing observations include either responses or covariates, or the responses and part of the covariates are missing at random, and the ultra-high dimension implies that the dimension of the parameter is much larger than the sample size. Based on the B-spline method for the varying coefficient functions, we study the consistency of the oracle estimator, which is obtained using only the active covariates whose coefficients are nonzero. We also discuss the asymptotic normality of the oracle estimator for the linear parameter. Since the active covariates are unknown in practice, a non-convex penalized estimator is investigated for simultaneous variable selection and estimation, and its oracle property is also established. The finite-sample behavior of the proposed methods is investigated via simulations and real data analysis.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 61272041.
Abstract: In this paper, based on spline approximation, the authors propose a unified variable selection approach for the single-index model via an adaptive L1 penalty. Computational methods for the proposed estimators are given on the basis of the well-known LARS algorithm. Under some regularity conditions, the authors demonstrate the asymptotic properties of the proposed estimators and the oracle properties of adaptive LASSO (aLASSO) variable selection. Simulations are used to investigate the performance of the proposed estimator and illustrate that it is effective for simultaneous variable selection and estimation in single-index models.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 12271294, 12171225 and 12071248.
Abstract: When the data contain outliers or follow heavy-tailed distributions, traditional penalized least squares is no longer applicable. In addition, with the rapid development of science and technology, large amounts of data characterized by high dimension, strong correlation, and redundancy have been generated in real life. It is therefore necessary to find an effective, robust variable selection method that can deal with collinearity. This paper proposes a penalized M-estimation method based on a standard-error-adjusted adaptive elastic-net, which uses M-estimators and the corresponding standard errors as weights. The consistency and asymptotic normality of this method are proved theoretically. For regularization in high-dimensional space, the authors use the multi-step adaptive elastic-net to reduce the dimension to a relatively large scale that is still less than the sample size, and then use the proposed method to select variables and estimate parameters. Finally, the authors carry out simulation studies and two real data analyses to examine the finite-sample performance of the proposed method. The results show that the proposed method has some advantages over other commonly used methods.
Funding: Supported by the National Natural Science Foundation of China (11501578, 11501579, 11701571, 41572315) and the Fundamental Research Funds for the Central Universities (CUGW150809).
Abstract: The seamless-L0 (SELO) penalty is a smooth function on [0, ∞) that very closely resembles the L0 penalty, and it has been demonstrated theoretically and practically to be effective in nonconvex penalization for variable selection. In this paper, we first generalize SELO to a class of penalties retaining the good features of SELO, and then propose variable selection and estimation in linear models using the proposed generalized SELO (GSELO) penalized least squares (PLS) approach. We show that the GSELO-PLS procedure possesses the oracle property and consistently selects the true model under some regularity conditions in the presence of a diverging number of variables. The entire path of GSELO-PLS estimates can be efficiently computed through a smoothing quasi-Newton (SQN) method. A modified BIC coupled with a continuation strategy is developed to select the optimal tuning parameter. Simulation studies and an analysis of a clinical data set are carried out to evaluate the finite-sample performance of the proposed method. In addition, numerical experiments involving simulation studies and an analysis of a microarray data set are conducted for GSELO-PLS in high-dimensional settings.
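To make the claim that SELO "closely resembles the L0 penalty" concrete, here is a minimal sketch assuming the form commonly given for SELO in the literature, p(β) = (λ / log 2) · log(|β| / (|β| + τ) + 1). The function name, the default τ = 0.01, and the sample inputs are illustrative assumptions; the GSELO class proposed in the paper is not reproduced here.

```python
import numpy as np

def selo_penalty(beta, lam, tau=0.01):
    """SELO penalty (assumed standard form), evaluated elementwise.

    p(beta) = lam / log(2) * log(|beta| / (|beta| + tau) + 1)
    It equals 0 at beta = 0 and approaches lam as |beta| grows,
    mimicking the L0 penalty lam * 1{beta != 0} when tau is small.
    """
    b = np.abs(np.asarray(beta, dtype=float))
    return (lam / np.log(2.0)) * np.log(b / (b + tau) + 1.0)

# Compare with the L0 penalty on a few illustrative coefficient values.
beta = np.array([0.0, 0.001, 0.1, 1.0, 10.0])
lam = 1.0
print(selo_penalty(beta, lam))          # smooth surrogate
print(lam * (beta != 0).astype(float))  # exact L0 penalty
```

Smaller values of τ bring the surrogate closer to the L0 penalty while keeping it continuous in β, which is what makes smooth optimization methods applicable.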
Abstract: We consider the problem of variable selection for the single-index varying-coefficient model, and present a regularized variable selection procedure that combines basis function approximations with the SCAD penalty. The proposed procedure simultaneously selects significant covariates with functional coefficients and locally significant variables with parametric coefficients. With appropriate selection of the tuning parameters, the consistency of the variable selection procedure and the oracle property of the estimators are established. The proposed method can naturally be applied to the pure single-index model and the varying-coefficient model. The finite-sample performance of the proposed method is illustrated by a simulation study and a real data analysis.
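For readers unfamiliar with the SCAD penalty named above, the sketch below evaluates its piecewise definition from Fan and Li (2001): linear near zero, quadratic in a transition zone, and constant for large coefficients. The function name and sample inputs are illustrative assumptions, a = 3.7 is the value suggested in that paper, and the basis-function procedure of the abstract is not implemented here.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty evaluated elementwise on |theta| (requires a > 2)."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                           # |theta| <= lam
    quad = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))   # lam < |theta| <= a * lam
    const = lam**2 * (a + 1) / 2                               # |theta| > a * lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))

# Small coefficients are penalized like the lasso; large ones get a flat penalty.
theta = np.array([0.0, 0.5, 1.0, 2.0, 3.7, 10.0])
print(scad_penalty(theta, lam=1.0))
```

The flat region is what removes the bias that the L1 penalty places on large coefficients, which underlies the oracle property mentioned in the abstract.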
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11501522, 11101014, 11001118 and 11171012), the National Statistical Research Projects (Grant No. 2014LZ45), the Doctoral Fund of Innovation of Beijing University of Technology, the Science and Technology Project of the Faculty Adviser of Excellent PhD Degree Thesis of Beijing (Grant No. 20111000503), and the Beijing Municipal Education Commission Foundation (Grant No. KM201110005029).
Abstract: In this paper, we consider the problem of variable selection and model detection in varying coefficient models with longitudinal data. We propose a combined penalization procedure to select the significant variables, detect the true structure of the model, and estimate the unknown regression coefficients simultaneously. With appropriate selection of the tuning parameters, we show that the proposed procedure is consistent in both variable selection and the separation of varying and constant coefficients, and that the penalized estimators have the oracle property. The finite-sample performance of the proposed method is illustrated by simulation studies and a real data analysis.
Funding: Supported by the Natural Science Foundation of USA (Grant Nos. DMS1206464 and DMS1613338) and the National Institutes of Health of USA (Grant Nos. R01GM072611, R01GM100474 and R01GM120507).
Abstract: In the statistics and machine learning communities, the last fifteen years have witnessed a surge of high-dimensional models backed by penalized methods and other state-of-the-art variable selection techniques. The high-dimensional models we refer to differ from conventional models in that the total number of parameters p and the number of significant parameters s are both allowed to grow with the sample size T. When field-specific knowledge is preliminary, and in view of the recent and potential abundance of data from genetics, finance, online social networks, etc., such (s, T, p)-triply diverging models enjoy ultimate flexibility in terms of modeling, and they can be used as a data-guided first step of investigation. However, model selection consistency and other theoretical properties have been addressed only for independent data, leaving time series largely uncovered. For a simple linear regression model endowed with a weakly dependent sequence, this paper applies a penalized least squares (PLS) approach. Under regularity conditions, we show sign consistency, derive a finite-sample bound that holds with high probability for the estimation error, and prove that the PLS estimate is consistent in the $L_2$ norm with rate $(s \log s / T)^{1/2}$.