Abstract
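Objective: To build a prediction model for dry eye by applying data mining to clinical data from dry eye patients. Methods: A total of 218 dry eye patients (436 eyes) who visited our hospital between March 2020 and January 2021 were enrolled in the dry eye group, and 212 healthy examinees without dry eye (424 eyes) were enrolled in the normal control group. All subjects in both groups underwent the Schirmer Ⅰ test (SⅠt), fluorescein staining tear film break-up time (FBUT), non-contact tear film break-up time (NI-BUT), tear meniscus height (TMH), corneal fluorescein sodium staining (FL), and meibomian gland function score (MG-SCORE) examinations. Data from 100 subjects (200 eyes) were randomly drawn from each of the dry eye and normal control groups to form the test set; the remaining 118 subjects (236 eyes) in the dry eye group and 112 subjects (224 eyes) in the normal control group formed the training set. The CFS (correlation feature searching) feature selection algorithm was used to identify the factors most strongly correlated with dry eye detection. C4.5 decision tree, Random Forest, Random Tree, Naive Bayes, KNN, SVM, Decision Stump, and Bagging machine learning methods were each used to build a dry eye prediction model, and a single-factor analysis was performed on the models. Results: CFS feature selection yielded four indicators as feature variables: SⅠt, NI-BUT, TMH, and FL score. Based on these four feature variables, the overall prediction accuracy of the models built with all eight machine learning algorithms exceeded 75%. The Random Forest model was the most accurate, with prediction accuracies of 91.8% for the dry eye group and 88.3% for the normal control group, and an overall prediction accuracy of 90.1%. Single-factor modeling showed that FL score and NI-BUT were the two variables with relatively high dry eye prediction accuracy, each exceeding 74%. Conclusion: The Random Forest algorithm can be used to build a dry eye prediction model with strong generalization ability and good stability. NI-BUT and FL correlate strongly with dry eye, and these two measures could be considered as data standards for clinically testing for dry eye.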
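The subject counts in the train/test split can be checked with a short sketch. It assumes the random draw was made per subject, so that both eyes of a person land on the same side of the split, which is how the "100 subjects (200 eyes)" wording reads. The subject IDs, the fixed seed, and the helper `split_by_subject` are hypothetical and are not taken from the paper.

```python
import random

EYES_PER_SUBJECT = 2

def split_by_subject(subject_ids, n_test, rng):
    """Randomly hold out n_test subjects; both eyes of a subject stay together."""
    ids = list(subject_ids)
    rng.shuffle(ids)
    return ids[n_test:], ids[:n_test]  # (train, test)

rng = random.Random(0)  # fixed seed: an assumption, for reproducibility only

# Hypothetical subject IDs; only the group sizes come from the abstract.
dry_eye = [f"D{i:03d}" for i in range(218)]
control = [f"C{i:03d}" for i in range(212)]

dry_train, dry_test = split_by_subject(dry_eye, 100, rng)
ctl_train, ctl_test = split_by_subject(control, 100, rng)

train_eyes = EYES_PER_SUBJECT * (len(dry_train) + len(ctl_train))
test_eyes = EYES_PER_SUBJECT * (len(dry_test) + len(ctl_test))
print(len(dry_train), len(ctl_train), train_eyes, test_eyes)
# prints: 118 112 460 400
```

Splitting at the subject level rather than the eye level avoids having one eye of a patient in the training set and the fellow eye in the test set, which would tend to inflate the measured accuracy.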
AIM: To build a prediction model of dry eye using data mining techniques. METHODS: From March 2020 to January 2021, 218 patients (436 eyes) with dry eye were enrolled in the dry eye group, and 212 subjects (424 eyes) without dry eye were enrolled in the control group. Schirmer Ⅰ test (SⅠt), fluorescein staining tear film break-up time (FBUT), non-contact tear film break-up time (NI-BUT), tear meniscus height (TMH), corneal fluorescein staining (FL), and meibomian gland function score (MG-SCORE) were measured in both groups. From each of the dry eye and control groups, 100 subjects (200 eyes) were randomly selected, forming a test set of 200 subjects (400 eyes). The remaining 118 subjects (236 eyes) in the dry eye group and 112 subjects (224 eyes) in the control group were used as the training set. The correlation feature searching (CFS) feature selection algorithm was used to identify the factors related to the detection of dry eye. C4.5, Random Forest, Random Tree, Naive Bayes, KNN, SVM, Decision Stump, and Bagging were each used to construct a prediction model. RESULTS: The CFS feature selection algorithm yielded an optimal feature subset consisting of SⅠt, NI-BUT, TMH, and FL. Based on these four features, eight machine learning algorithms were each used to build a prediction model, and the overall prediction accuracy of every model was higher than 75%. Among the eight models, Random Forest had the highest prediction accuracy: 91.8% for the dry eye group and 88.3% for the control group, with an overall prediction accuracy of 90.1%. In addition, single-factor modeling showed that FL and NI-BUT had the highest prediction accuracies, both exceeding 74%. CONCLUSION: Random Forest can be considered a stable algorithm with good generalization for building a prediction model for dry eye. NI-BUT and FL correlate strongly with dry eye and could be considered as standards for the clinical examination of dry eye.
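The abstract does not spell out how CFS scores a feature subset. As a rough illustration, the sketch below assumes that CFS refers to the widely used correlation-based feature selection heuristic, in which a subset of k features has merit k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-to-class correlation and r_ff the mean feature-to-feature correlation. Using Pearson correlation is also an assumption, and the function names and toy values are hypothetical with no clinical meaning.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cfs_merit(features, labels):
    """Merit of a feature subset: rewards class correlation, penalizes redundancy."""
    k = len(features)
    r_cf = sum(abs(pearson(f, labels)) for f in features) / k
    pairs = list(combinations(features, 2))
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Synthetic toy data (not from the study): 0 = control, 1 = dry eye.
labels = [0, 0, 0, 1, 1, 1]
informative = [1, 2, 3, 7, 8, 9]   # tracks the label closely
noisy = [5, 1, 4, 2, 6, 3]         # nearly unrelated to the label

print(cfs_merit([informative], labels))
print(cfs_merit([informative, noisy], labels))
print(cfs_merit([informative, informative], labels))
```

Under this heuristic, adding a weakly related feature lowers the merit, and adding an exact copy of a feature leaves it unchanged, which is why a search over subsets can settle on a small set such as the four indicators reported here.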
Authors
张弛
王萍
苏佳山
程冬梅
Chi Zhang; Ping Wang; Jia-Shan Su; Dong-Mei Cheng (Huaxia Eye Hospital of Foshan, Huaxia Eye Hospital Group, Foshan 528000, Guangdong Province, China)
Source
《国际眼科杂志》
CAS
Peking University Core Journal
2021, Issue 9, pp. 1644-1648 (5 pages)
International Eye Science
Funding
Medical Scientific Research Foundation of Guangdong Province (No. A2020406)
Huaxia Translational Medicine Youth Fund (No. 2017-D-001)
Keywords
dry eye
machine learning
prediction
feature selection
model