Abstract
Financial fraud among listed companies is a serious problem that traditional methods struggle to detect. To address this, a machine learning approach based on the random forest algorithm is proposed for identifying such fraud. Financial data of listed companies, crawled from the East Money website (东方财富网), are used to build a random forest model that identifies fraudulent conduct. Credit information is mined to extract valuable information and generate new features, and recursive feature elimination is then used to retain the meaningful features. A random forest model is built on the training set and its performance is evaluated on the test set. The empirical results show that the random forest model achieves higher accuracy; the results are then analyzed and conclusions are drawn.
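The workflow summarized in the abstract (feature generation, recursive feature elimination, then a random forest trained and evaluated on separate splits) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the author's implementation: the data are synthetic stand-ins generated with `make_classification`, and the sample size, the number of retained features (8), the tree counts, and the 70/30 split are all assumptions chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for crawled financial indicators (assumption):
# 600 firms, 20 candidate features, roughly 20% labelled as fraudulent.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, n_redundant=4,
    weights=[0.8, 0.2], random_state=0,
)

# Hold out 30% of the firms as a test set, preserving the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Recursive feature elimination: repeatedly fit a forest and drop the
# least important feature until 8 remain. Fitted on training data only,
# so the test set does not influence which features are kept.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=8,
    step=1,
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Final random forest trained on the retained features.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train_sel, y_train)

# Evaluate on the held-out test set.
accuracy = accuracy_score(y_test, model.predict(X_test_sel))
print(f"kept features: {selector.support_.sum()}, test accuracy: {accuracy:.3f}")
```

Because fraud cases are usually a small minority, accuracy alone can look good even for a weak model; in practice one would likely also inspect recall or a confusion matrix for the fraudulent class.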
Author
吕艳
LV Yan (Department of Financial, West Anhui University, Lu’an 237000, China)
Source
《西安文理学院学报(自然科学版)》
2023, Issue 3, pp. 13-16, 21 (5 pages in total)
Journal of Xi’an University(Natural Science Edition)
Keywords
management of fraudulent practices
model identification
random forest