In robust regression we often have to decide how many are the unusualobservations, which should be removed from the sample in order to obtain better fitting for the restof the observations. Generally, we use the basic...In robust regression we often have to decide how many are the unusualobservations, which should be removed from the sample in order to obtain better fitting for the restof the observations. Generally, we use the basic principle of LTS, which is to fit the majority ofthe data, identifying as outliers those points that cause the biggest damage to the robust fit.However, in the LTS regression method the choice of default values for high break down-point affectsseriously the efficiency of the estimator. In the proposed approach we introduce penalty cost fordiscarding an outlier, consequently, the best fit for the majority of the data is obtained bydiscarding only catastrophic observations. This penalty cost is based on robust design weights andhigh break down-point residual scale taken from the LTS estimator. The robust estimation is obtainedby solving a convex quadratic mixed integer programming problem, where in the objective functionthe sum of the squared residuals and penalties for discarding observations is minimized. Theproposed mathematical programming formula is suitable for small-sample data. Moreover, we conduct asimulation study to compare other robust estimators with our approach in terms of their efficiencyand robustness.展开更多
文摘In robust regression we often have to decide how many are the unusualobservations, which should be removed from the sample in order to obtain better fitting for the restof the observations. Generally, we use the basic principle of LTS, which is to fit the majority ofthe data, identifying as outliers those points that cause the biggest damage to the robust fit.However, in the LTS regression method the choice of default values for high break down-point affectsseriously the efficiency of the estimator. In the proposed approach we introduce penalty cost fordiscarding an outlier, consequently, the best fit for the majority of the data is obtained bydiscarding only catastrophic observations. This penalty cost is based on robust design weights andhigh break down-point residual scale taken from the LTS estimator. The robust estimation is obtainedby solving a convex quadratic mixed integer programming problem, where in the objective functionthe sum of the squared residuals and penalties for discarding observations is minimized. Theproposed mathematical programming formula is suitable for small-sample data. Moreover, we conduct asimulation study to compare other robust estimators with our approach in terms of their efficiencyand robustness.