
基于机器学习的公交驾驶员事故风险识别及影响因素研究 (cited by: 2)

Research on accident risk identification and influencing factors of bus drivers based on machine learning
Abstract To identify accident-prone (risky) drivers within a population of bus drivers, a data set covering 42 routes and 1,893 drivers was assembled from the operational safety management system database of a city bus company, the Baidu application programming interface (API), and web crawling, with missing values imputed by the K-nearest neighbor algorithm. Derived variables were constructed from basic feature variables describing drivers, vehicles, route characteristics, violations, accidents, and management. Features were selected with an ensemble approach that combines recursive feature elimination, penalized logistic regression, and random forest. Classification models were built with six machine learning methods, including extreme gradient boosting (XGBoost), and their hyperparameters were tuned by Bayesian optimization. The results show that, of the six classification models, the XGBoost model achieves the best area under the receiver operating characteristic (ROC) curve (AUC), and that Bayesian optimization improves the AUC to some extent. Prediction accuracy for risky bus drivers reaches 98.66%, and operators can further weigh the cost of the false positive rate against the true positive rate according to their own circumstances. In addition, features such as vehicle service time, driving age, number of violations, and penalties have a marked and clearly nonlinear effect on accident risk.
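The trade-off between false positive rate and true positive rate mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy scores, and the cost values below are all hypothetical, and the scores stand in for the output of any classifier (XGBoost or otherwise). It only shows how ROC points, the AUC, and a cost-based decision threshold might be computed.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, false_positive_rate, true_positive_rate) triples.

    A driver is flagged as risky when score >= threshold.
    Labels are 1 (risky driver) or 0 (non-risky driver).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(float("inf"), 0.0, 0.0)]  # threshold so high that nobody is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float, float]]) -> float:
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def cheapest_threshold(labels: Sequence[int], scores: Sequence[float],
                       cost_false_alarm: float,
                       cost_miss: float) -> Tuple[float, float]:
    """Return (threshold, total_cost) minimising false-alarm plus miss cost.

    The two cost values are assumptions supplied by the operator.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for threshold, fpr, tpr in roc_points(labels, scores):
        false_alarms = fpr * negatives
        misses = (1.0 - tpr) * positives
        cost = cost_false_alarm * false_alarms + cost_miss * misses
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


if __name__ == "__main__":
    # Hypothetical labels and classifier scores for eight drivers.
    labels = [1, 1, 0, 1, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
    print("AUC:", auc(roc_points(labels, scores)))
    print("miss is costly:", cheapest_threshold(labels, scores, 1.0, 5.0))
    print("false alarm is costly:", cheapest_threshold(labels, scores, 5.0, 1.0))
```

When a missed risky driver is assumed to cost more than a false alarm, the selected threshold drops and more drivers are flagged; reversing the costs raises it.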
Authors ZHU Tong; QIN Dan; WEI Wen; REN Jie; FENG Yidong (College of Transportation Engineering, Chang'an University, Xi'an, Shaanxi 710064, China; Research Institute of Highway, Ministry of Transport, Beijing 100088, China)
Source China Safety Science Journal (《中国安全科学学报》; indexed in CAS, CSCD, and the Peking University Core Journals list), 2023, No. 2, pp. 23-30 (8 pages)
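Funding National Key Research and Development Program of China (2019YFE0108000); Scientific Research Project of the Shaanxi Provincial Department of Transport (21-34R).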
Keywords risky bus drivers; machine learning; accident risk; extreme gradient boosting (XGBoost); SHapley additive explanation (SHAP) value