Abstract
Objective: To explore the application of an information-entropy-based decision tree to the analysis of hospitalization expenses of tuberculosis patients. Methods: The entropy-based C4.5 decision tree algorithm was used to build an analysis model of hospitalization expenses for tuberculosis patients. Results: From 20 candidate variables, the C4.5 algorithm selected 14 factors with a meaningful influence on hospitalization expenses and ranked them by importance; it produced clear, readable decision rules that can be used for prediction; and it yielded a predictive model whose classification accuracy was 76.58% on the training set, 77.31% on the validation set, and 77.95% on the test set. Conclusion: The model built with the C4.5 algorithm performs well and can be applied to analyzing the factors that influence hospitalization expenses of tuberculosis patients and to predicting those expenses.
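The abstract does not give the details of the authors' implementation. As a rough illustration of how an entropy-based tree such as C4.5 can rank candidate factors, the sketch below computes the gain ratio (information gain divided by split information), which is the splitting criterion C4.5 uses for categorical attributes. The variable names (`length_of_stay`, `sex`, `cost_level`) and the six toy records are hypothetical and are not taken from the study; handling of continuous attributes and pruning is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of categorical labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    n = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (NOT data from the study).
length_of_stay = ["long", "long", "short", "short", "long", "short"]
sex = ["M", "F", "M", "F", "M", "F"]
cost_level = ["high", "high", "low", "low", "high", "low"]

scores = {
    "length_of_stay": gain_ratio(length_of_stay, cost_level),
    "sex": gain_ratio(sex, cost_level),
}
# Rank candidate factors from most to least informative.
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

In this toy example the hypothetical `length_of_stay` attribute separates the cost classes perfectly and therefore ranks above `sex`; a full C4.5 tree would apply the same criterion recursively at each node.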
Source
Chinese Journal of Hospital Statistics (《中国医院统计》)
2009, No. 3, pp. 223-225 (3 pages)
Keywords
Information entropy
Decision tree
Hospitalization expense