Abstract
<div style="text-align:justify;"> With the rapid development of information technology, contemporary datasets from a variety of fields have become extremely large. When the number of features in a dataset far exceeds the sample size, the data are called high-dimensional. In statistics, variable selection approaches are required to extract useful information from high-dimensional data. The most popular approach, the penalized likelihood method, adds a penalty function coupled with a tuning parameter to the log-likelihood function. However, almost all penalized likelihood approaches consider only noise accumulation and spurious correlation, while ignoring endogeneity, which also arises frequently in high-dimensional settings. In this paper, we explore the causes of endogeneity and its influence on penalized likelihood approaches. Simulations based on five classical penalized approaches are provided to demonstrate their inconsistency under endogeneity. The results show that, when endogenous variables exist, the positive selection rate of all five approaches increases gradually but the false selection rate does not consistently decrease; that is, the approaches do not satisfy selection consistency. </div>
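<div style="text-align:justify;"> For orientation, the penalized likelihood idea and the notion of endogeneity mentioned in the abstract can be written in a common textbook form. The notation here is an assumption for illustration and is not taken from the paper: \ell_n is the log-likelihood based on n observations, \beta is the p-dimensional coefficient vector, p_\lambda is a generic penalty function with tuning parameter \lambda, and the linear model in the second display is only one possible setting. </div>

```latex
% Penalized likelihood estimator (generic form; normalization conventions vary)
\hat{\beta} \;=\; \arg\max_{\beta \in \mathbb{R}^{p}}
  \left\{ \frac{1}{n}\,\ell_n(\beta) \;-\; \sum_{j=1}^{p} p_{\lambda}\!\left(|\beta_j|\right) \right\}

% Endogeneity in a linear model y = x^{\top}\beta + \varepsilon (assumed setting):
% exogeneity requires E(\varepsilon\, x_j) = 0 for every j = 1, \dots, p;
% a covariate x_j is endogenous when this fails, i.e.
E(\varepsilon\, x_j) \;\neq\; 0 \quad \text{for some } j \in \{1, \dots, p\}
```

<div style="text-align:justify;"> Different choices of p_\lambda give different penalized approaches; for example, the lasso corresponds to p_\lambda(t) = \lambda t. Which five approaches the paper compares is not stated in the abstract. </div>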
Author
Yawei He
Yawei He (Department of Mathematics and Statistics, Chongqing Jiaotong University, Chongqing, China)