Objective Clinical medical record data associated with hepatitis B-related acute-on-chronic liver failure(HBV-ACLF)generally have small sample sizes and a class imbalance.However,most machine learning models are desig...Objective Clinical medical record data associated with hepatitis B-related acute-on-chronic liver failure(HBV-ACLF)generally have small sample sizes and a class imbalance.However,most machine learning models are designed based on balanced data and lack interpretability.This study aimed to propose a traditional Chinese medicine(TCM)diagnostic model for HBV-ACLF based on the TCM syndrome differentiation and treatment theory,which is clinically interpretable and highly accurate.Methods We collected medical records from 261 patients diagnosed with HBV-ACLF,including three syndromes:Yang jaundice(214 cases),Yang-Yin jaundice(41 cases),and Yin jaundice(6 cases).To avoid overfitting of the machine learning model,we excluded the cases of Yin jaundice.After data standardization and cleaning,we obtained 255 relevant medical records of Yang jaundice and Yang-Yin jaundice.To address the class imbalance issue,we employed the oversampling method and five machine learning methods,including logistic regression(LR),support vector machine(SVM),decision tree(DT),random forest(RF),and extreme gradient boosting(XGBoost)to construct the syndrome diagnosis models.This study used precision,F1 score,the area under the receiver operating characteristic(ROC)curve(AUC),and accuracy as model evaluation metrics.The model with the best classification performance was selected to extract the diagnostic rule,and its clinical significance was thoroughly analyzed.Furthermore,we proposed a novel multiple-round stable rule extraction(MRSRE)method to obtain a stable rule set of features that can exhibit the model’s clinical interpretability.Results The precision of the five machine learning models built using oversampled balanced data exceeded 0.90.Among these models,the accuracy of RF classification of syndrome types was 0.92,and the mean F1 scores of the two categories of Yang jaundice and Yang-Yin jaundice were 0.93 and 0.94,respectively.Additionally,the AUC was 0.98.The extraction rules of the RF syndrome differentiation model based on the MRSRE展开更多
基金Key research project of Hunan Provincial Administration of Traditional Chinese Medicine(A2023048)Key Research Foundation of Education Bureau of Hunan Province,China(23A0273).
文摘Objective Clinical medical record data associated with hepatitis B-related acute-on-chronic liver failure(HBV-ACLF)generally have small sample sizes and a class imbalance.However,most machine learning models are designed based on balanced data and lack interpretability.This study aimed to propose a traditional Chinese medicine(TCM)diagnostic model for HBV-ACLF based on the TCM syndrome differentiation and treatment theory,which is clinically interpretable and highly accurate.Methods We collected medical records from 261 patients diagnosed with HBV-ACLF,including three syndromes:Yang jaundice(214 cases),Yang-Yin jaundice(41 cases),and Yin jaundice(6 cases).To avoid overfitting of the machine learning model,we excluded the cases of Yin jaundice.After data standardization and cleaning,we obtained 255 relevant medical records of Yang jaundice and Yang-Yin jaundice.To address the class imbalance issue,we employed the oversampling method and five machine learning methods,including logistic regression(LR),support vector machine(SVM),decision tree(DT),random forest(RF),and extreme gradient boosting(XGBoost)to construct the syndrome diagnosis models.This study used precision,F1 score,the area under the receiver operating characteristic(ROC)curve(AUC),and accuracy as model evaluation metrics.The model with the best classification performance was selected to extract the diagnostic rule,and its clinical significance was thoroughly analyzed.Furthermore,we proposed a novel multiple-round stable rule extraction(MRSRE)method to obtain a stable rule set of features that can exhibit the model’s clinical interpretability.Results The precision of the five machine learning models built using oversampled balanced data exceeded 0.90.Among these models,the accuracy of RF classification of syndrome types was 0.92,and the mean F1 scores of the two categories of Yang jaundice and Yang-Yin jaundice were 0.93 and 0.94,respectively.Additionally,the AUC was 0.98.The extraction rules of the RF syndrome differentiation model based on the MRSRE