Funding: supported by the National Natural Science Foundation of China (Grant Nos. 12301377, 11971208 and 92358303), the National Social Science Foundation of China (Grant No. 21&ZD152), the Outstanding Youth Fund Project of the Science and Technology Department of Jiangxi Province (Grant No. 20224ACB211003), the Jiangxi Provincial Natural Science Foundation (Grant No. 20232BAB211014), the Science and Technology Research Project of the Education Department of Jiangxi Province (Grant No. GJJ210535), the opening funding of the Key Laboratory of Data Science in Finance and Economics, and the innovation team funding of Digital Economy and Industrial Development, Jiangxi University of Finance and Economics.
Abstract: Tensor data have been widely used in many fields, e.g., modern biomedical imaging, chemometrics, and economics, but often suffer from the same issues that arise in high-dimensional statistics. How to find their low-dimensional latent structure has been of great interest to statisticians. To this end, we develop two efficient tensor sufficient dimension reduction methods based on sliced average variance estimation (SAVE) to estimate the corresponding dimension reduction subspaces. The first one, entitled tensor sliced average variance estimation (TSAVE), works well when the response is discrete or takes finite values, but is not consistent for a continuous response; the second one, named bias-corrected tensor sliced average variance estimation (CTSAVE), is a de-biased version of the TSAVE method. The asymptotic properties of both methods are derived under mild conditions. Simulations and real data examples are also provided to demonstrate the efficiency of the developed methods.
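Classical (vector-predictor) SAVE, which TSAVE extends to tensor predictors, can be sketched in a few lines: standardize the predictors, slice the sample by the response, and average the squared deviations of the within-slice covariances from the identity. The sketch below is a minimal illustration of that classical procedure, not the paper's tensor method; the function name and quantile-based slicing scheme are my own choices.

```python
import numpy as np

def save_directions(X, y, n_slices=5, n_dirs=1):
    """Classical sliced average variance estimation (SAVE).

    Standardizes X, slices observations by y, and forms the kernel
    matrix M = sum_h p_h (I - V_h)^2, where V_h is the within-slice
    covariance of the standardized predictors.
    """
    n, p = X.shape
    # Standardize: Z = Sigma^{-1/2} (X - mean)
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ Sigma_inv_sqrt

    # Slice by quantiles of y (for a discrete y, its values could be used directly)
    edges = np.quantile(y, np.linspace(0, 1, n_slices + 1))
    labels = np.clip(np.searchsorted(edges, y, side="right") - 1, 0, n_slices - 1)

    M = np.zeros((p, p))
    for h in range(n_slices):
        Zh = Z[labels == h]
        if len(Zh) < 2:
            continue
        Vh = np.cov(Zh, rowvar=False)
        D = np.eye(p) - Vh
        M += (len(Zh) / n) * D @ D
    # Leading eigenvectors of M, mapped back to the original X scale
    _, vecs = np.linalg.eigh(M)
    B = Sigma_inv_sqrt @ vecs[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)
```

Because SAVE works with second moments within slices, it can recover symmetric directions (e.g., a response depending on the square of a linear combination) that first-moment methods like SIR miss.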
Funding: supported by the National Natural Science Foundation of China (Grant No. 10771015).
Abstract: In this paper, we propose a new estimate for dimension reduction, called the weighted variance estimate (WVE), which includes the sliced average variance estimate (SAVE) as a special case. A bootstrap method is used to select the best estimate from the WVE family and to estimate the structural dimension, and this selected best estimate usually performs better than existing methods such as sliced inverse regression (SIR) and SAVE. Methods such as SIR and SAVE usually put the same weight on each observation when estimating the central subspace (CS). By introducing a weight function, WVE puts different weights on different observations according to their distance from the CS. The weight function gives WVE very good performance in general and in complicated situations, for example, when the distribution of the regressor deviates severely from the elliptical distributions on which many methods, such as SIR, are based. Compared with many existing methods, WVE is insensitive to the distribution of the regressor. The consistency of WVE is established. Simulations comparing the performance of WVE with other existing methods confirm its advantage.
Funding: supported by the National Natural Science Foundation of China (Grant No. 10701035), the Chen Guang Project of the Shanghai Education Development Foundation (Grant No. 2007CG33), the Research Grants Council of Hong Kong, and a Faculty Research Grant from Hong Kong Baptist University.
Abstract: Large dimensional predictors are often introduced in regressions to attenuate possible modeling bias. We consider stable direction recovery in single-index models in which we solely assume the response Y is independent of the diverging-dimensional predictors X given β₀ᵀX, where β₀ is a pₙ × 1 vector and pₙ → ∞ as the sample size n → ∞. We first explore sufficient conditions under which the least squares estimate βₙ₀ recovers the direction β₀ consistently even when pₙ = o(√n). To enhance model interpretability by excluding irrelevant predictors from the regression, we suggest an ℓ₁-regularization algorithm with a quadratic constraint on the magnitude of the least squares residuals to search for a sparse estimate of β₀. Not only can the ℓ₁-regularized solution βₙ recover β₀ consistently, it also produces sufficiently sparse estimators that enable us to select "important" predictors, facilitating model interpretation while maintaining prediction accuracy. Further analysis by simulations and an application to the car price data suggests that our proposed estimation procedures have good finite-sample performance and are computationally efficient.
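The idea of a sparse, ℓ₁-regularized direction estimate can be illustrated with a plain coordinate-descent lasso in its Lagrangian form (the paper uses a quadratic constraint on the residuals instead; the function name, soft-thresholding solver, and tuning value below are illustrative choices, not the authors' algorithm):

```python
import numpy as np

def l1_direction(X, y, lam, n_iter=200):
    """Sparse direction estimate via coordinate-descent lasso.

    Minimizes (1/2)||y - X beta||^2 + lam * ||beta||_1, then
    normalizes beta to a unit vector as the estimated direction.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with coordinate j removed
            r = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r
            # Soft-thresholding update for coordinate j
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    nrm = np.linalg.norm(beta)
    return beta / nrm if nrm > 0 else beta
```

With a sparse true index vector, the ℓ₁ penalty thresholds irrelevant coordinates to exactly zero, so the normalized solution both points close to the true direction (up to sign) and identifies the relevant predictors.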
Funding: supported by the National Natural Science Foundation of China (Grant No. 11971170), the 111 Project (B14019), and the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning.
Abstract: High-dimensional data analysis has been a challenging issue in statistics. Sufficient dimension reduction aims to reduce the dimension of the predictors by replacing the original predictors with a minimal set of their linear combinations without loss of information. However, the estimated linear combinations generally involve all of the variables, making them difficult to interpret. To circumvent this difficulty, sparse sufficient dimension reduction methods were proposed to conduct model-free variable selection or screening within the framework of sufficient dimension reduction. We review the current literature on sparse sufficient dimension reduction and carry out some further investigation in this paper.
Funding: supported by the Humanities and Social Science Foundation of the Ministry of Education (Grant No. 20YJC910003), the Natural Science Foundation of Shanghai (Grant No. 20ZR1423000), the Natural Science Foundation of Beijing (Grant No. Z19J0002), and the National Natural Science Foundation of China (Grant Nos. 11731011 and 11931014).
Abstract: We are concerned with partial dimension reduction for the conditional mean function in the presence of controlling variables. We suggest a profile least squares approach to perform partial dimension reduction for a general class of semi-parametric models. The asymptotic properties of the resulting estimates of the central partial mean subspace and the mean function are provided. In addition, a Wald-type test is proposed to evaluate a linear hypothesis about the central partial mean subspace, and a generalized likelihood ratio test is constructed to check whether the nonparametric mean function has a specific parametric form. These tests can be used to evaluate whether there exist interactions between the covariates and the controlling variables, and if so, in what form. A Bayesian information criterion (BIC)-type criterion is applied to determine the structural dimension of the central partial mean subspace, and its consistency is established. Numerical studies through simulations and real data examples demonstrate the power and utility of the proposed semi-parametric approaches.
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 11871287 and 11831008), the Natural Science Foundation of Tianjin (Grant No. 18JCYBJC41100), the Fundamental Research Funds for the Central Universities, the Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin, the Chinese 111 Project (B14019), and the U.S. National Science Foundation (DMS-1612873 and DMS-1914411); partially supported through a Patient-Centered Outcomes Research Institute (PCORI) Award (ME-1409-21219).
Abstract: Quantile treatment effects can be important causal estimands in the evaluation of biomedical treatments or interventions for health outcomes such as medical cost and utilisation. We consider their estimation in observational studies with many possible covariates under the assumption that treatment and potential outcomes are independent conditional on all covariates. To obtain valid and efficient treatment effect estimators, we replace the set of all covariates with lower-dimensional sets for estimating the quantiles of the potential outcomes. These lower-dimensional sets are obtained using sufficient dimension reduction tools and are outcome specific. We justify this choice from an efficiency point of view. We prove the asymptotic normality of our estimators, and our theory is complemented by simulation results and an application to data from the University of Wisconsin Health Accountable Care Organization.
Abstract: Existing estimators of the central mean subspace are known to have uneven performance across different types of link functions. By combining the strengths of ordinary least squares and principal Hessian directions, the authors propose a new hybrid estimator that successfully recovers the central mean subspace for a wide range of link functions. Based on the new hybrid estimator, the authors further study the order determination procedure and the marginal coordinate test. The superior performance of the hybrid estimator over existing methods is demonstrated in extensive simulation studies.
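The principal Hessian directions ingredient of such a hybrid can be sketched as follows. This is a minimal response-based PHD sketch, not the authors' exact hybrid estimator; the function name is illustrative. It standardizes the predictors and eigendecomposes the moment matrix M = E[(y − E y) Z Zᵀ], whose leading eigenvectors (by absolute eigenvalue) estimate directions in the central mean subspace.

```python
import numpy as np

def phd_directions(X, y, n_dirs=1):
    """Response-based principal Hessian directions (PHD).

    Forms M = (1/n) * sum_i (y_i - ybar) z_i z_i^T on standardized
    predictors and returns the leading eigenvectors, mapped back
    to the original X scale.
    """
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ Sigma_inv_sqrt

    yc = y - y.mean()
    M = (Z * yc[:, None]).T @ Z / n  # symmetric p x p moment matrix
    w, v = np.linalg.eigh(M)
    order = np.argsort(-np.abs(w))   # rank by |eigenvalue|
    B = Sigma_inv_sqrt @ v[:, order[:n_dirs]]
    return B / np.linalg.norm(B, axis=0)
```

PHD is strong for link functions with curvature (e.g., quadratic links) but weak for purely linear trends, where the OLS slope excels; that complementarity is what motivates combining the two.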