Abstract
In this paper, a logistic regression model is built to predict whether bank customers in Taiwan will default on their payments, so that a bank can assess default risk when deciding whether to lend. The data contain 23 variables, some of which are not significant, so best subset selection is first used to determine that the optimal number of variables is 8. Forward stepwise selection is then used to choose the 8 variables, and a logistic model is fitted to them. The data are split into a training set and a test set to evaluate the model: overall prediction accuracy is 80.2%, which is acceptable. The model predicts non-default very accurately but performs poorly at predicting default. The ROC curve is used as a further, visual measure of model quality, giving an AUC of 0.66. The model therefore performs better than random guessing and has some predictive value.
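The abstract reports a fairly high overall accuracy alongside weak detection of defaulters and a modest AUC. The sketch below shows how these three kinds of figure can be computed and why they can diverge when non-defaulters dominate the sample. The ten labels and scores are made-up toy values, not the paper's data, and the 0.5 threshold and all function names are assumptions chosen for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn), treating label 1 (default) as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy_and_recalls(y_true, y_pred):
    """Overall accuracy plus the recall of each class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    recall_default = tp / (tp + fn) if (tp + fn) else 0.0
    recall_non_default = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, recall_default, recall_non_default


def auc(y_true, scores):
    """AUC as the fraction of (default, non-default) pairs in which the
    defaulter receives the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 8 non-defaulters (0) and 2 defaulters (1),
# with predicted default probabilities from some fitted model.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.15, 0.30, 0.25, 0.05, 0.40, 0.55, 0.35, 0.60]

# Assumed decision threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc, rec_default, rec_non_default = accuracy_and_recalls(y_true, y_pred)
auc_value = auc(y_true, scores)

print("accuracy:", acc)
print("recall (default):", rec_default)
print("recall (non-default):", rec_non_default)
print("AUC:", auc_value)
```

Because most customers in the toy sample do not default, accuracy is driven mainly by the non-default class, so it can look reasonable even when half of the defaulters are missed. AUC does not depend on the threshold, which makes it a useful complement to accuracy.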
Source
Pure Mathematics (《理论数学》)
2023, No. 5, pp. 1315-1320 (6 pages)