Abstract
Objective To detect malicious traffic in medical and health application scenarios using two machine learning models, random forest and decision tree. Methods The CICIDS2017 sample set was used as the training and validation set. After preprocessing in Python, a total of 1708979 records were used for model training; the training set accounted for 80% (1367183 records) and the validation set for 20% (341795 records). The random forest and decision tree models were trained with parameter tuning in sklearn, and 500 network traffic records captured in a medical and health application scenario were then used as the test set to evaluate the models' generalization ability. Results The confusion matrices of the two models show that the decision tree model achieved a prediction accuracy of 95% for slow denial-of-service attacks and cross-site scripting attacks; in particular, when predicting slow denial-of-service attacks, it confused them with cross-site scripting attacks. The random forest model achieved a prediction accuracy of 99% for slow denial-of-service attacks and correctly predicted most of them. The random forest model performed well overall in medical and health application scenarios. Conclusion Both models achieved good accuracy for malicious traffic detection in medical and health application scenarios, but the accuracy of the traditional decision tree model was lower than that of the random forest model. The random forest model is better suited to malicious traffic detection in medical and health scenarios and can serve as a reference for network security research in medical and health application scenarios.
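The workflow described in the Methods (an 80/20 train/validation split, a decision tree and a random forest trained in sklearn, and a comparison through confusion matrices) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are a synthetic stand-in generated with `make_classification` rather than CICIDS2017, the hyperparameters (`max_depth=10`, `n_estimators=100`) and the stratified split are assumptions, and the per-class figure computed here is recall, since the abstract does not say which per-class metric its "prediction accuracy" refers to.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for preprocessed flow features (assumption: NOT CICIDS2017 data).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8,
    n_classes=3, random_state=0,
)

# 80% training / 20% validation; stratifying by label is an assumption.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hyperparameters are illustrative; the abstract does not report the tuned values.
models = {
    "decision_tree": DecisionTreeClassifier(max_depth=10, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate(model):
    """Fit on the training split; return the validation confusion matrix
    and the per-class recall (diagonal divided by row sums)."""
    model.fit(X_train, y_train)
    cm = confusion_matrix(y_val, model.predict(X_val))
    return cm, cm.diagonal() / cm.sum(axis=1)


results = {name: evaluate(model) for name, model in models.items()}
for name, (cm, recall) in results.items():
    print(name)
    print(cm)
    print("per-class recall:", recall.round(3))
```

Off-diagonal entries in a row of the confusion matrix show which classes a given attack type is mistaken for, which is the kind of confusion the Results describe between slow denial-of-service and cross-site scripting traffic.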
Authors
高健云
刘颖颖
戴依蓝
李澍
GAO Jianyun; LIU Yingying; DAI Yilan; LI Shu (Institute of Medical Device Control, China Academy of Food and Drug Control, Beijing 102629, China; School of Medical Devices, Shenyang Pharmaceutical University, Shenyang Liaoning 117004, China; Beijing Medical and Health Technology Development Center, Beijing 100035, China)
Source
《中国医疗设备》
2024, Issue 1, pp. 12-17 (6 pages)
China Medical Devices
Funding
National Key Research and Development Program of China (2020YFC2007104).
Keywords
medical and health application scenarios
machine learning
decision tree
random forest
network security