Abstract
Financial fraud not only distorts accounting information but also endangers the healthy development of the economy, so an efficient, intelligent method for identifying fraud is of great practical significance. Using the annual reports that U.S. listed companies filed with the EDGAR database from 2020 to 2022, this paper analyzes the textual information in the Management Discussion and Analysis (MD&A) section of those reports. Because fraudulent and non-fraudulent samples in the available data are extremely imbalanced, the paper designs a more efficient financial fraud identification model based on a hierarchical attention network. The results show that the model's F1 and F2 scores increase by 4.1% and 3.7%, respectively, and that the proposed algorithmic framework effectively improves classification accuracy on imbalanced MD&A text datasets. The study offers a new approach to improving the performance of financial fraud identification systems and to long-text classification tasks in other fields, further verifies the effectiveness of MD&A text data for financial fraud identification, and provides direct empirical support for fraud identification with imbalanced data.
Authors
程双双
谷晓燕
王兴芬
CHENG Shuangshuang; GU Xiaoyan; WANG Xingfen (School of Information Management, Beijing Information Science and Technology University, Beijing 102206)
Source
《管理现代化》
Peking University Core Journal (北大核心)
2024, No. 1, pp. 121-127 (7 pages)
Modernization of Management
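Funding
National Natural Science Foundation of China project "Research on Network Calculus Measurement Methods for Risk Accumulation in the R&D Process of Complex Product Systems" (Grant No. 71701020).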
Keywords
Financial fraud identification
Management discussion and analysis
Hierarchical attention network
Unbalanced text data