Abstract
To address the poor recognition of sentencing circumstances in judgment documents, which stems from scarce labeled data, low annotation quality, and the strong logical constraints of the judicial domain, a sentencing circumstance recognition model based on abductive learning, named ABL-CON (ABductive Learning in CONfidence), is proposed. First, the model combines a neural network with domain logic inference in a semi-supervised setting and uses a confidence learning method to characterize the confidence of circumstance recognition. Then, the illogical circumstance labels that the neural network produces for unlabeled data are corrected, and the recognition model is retrained to improve recognition accuracy. Experimental results on a self-constructed judicial dataset show that ABL-CON trained with 50% labeled data and 50% unlabeled data achieves a Macro_F1 of 90.35% and a Micro_F1 of 90.58%, outperforming BERT (Bidirectional Encoder Representations from Transformers) and SS-ABL (Semi-Supervised ABductive Learning) under the same conditions, and also surpassing the BERT model trained with 100% labeled data. By correcting illogical labels through logical abduction, ABL-CON effectively improves both the logical consistency of the labels and label recognition performance.
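The correction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the mutual-exclusion rules, and the use of raw predicted probabilities as the confidence measure are all assumptions made for illustration, and the retraining step is omitted. Given a multi-label prediction that violates a rule, the sketch searches for the rule-consistent assignment that is most likely under the network's probabilities, so the least confident labels are revised first.

```python
from itertools import product

# Hypothetical label set and mutual-exclusion rules; not taken from the paper.
LABELS = ["surrender", "confession", "recidivist", "first_offender"]
EXCLUSIVE = [("surrender", "confession"), ("recidivist", "first_offender")]


def is_consistent(assign):
    """True if no pair of mutually exclusive labels is predicted together."""
    return not any(assign[a] and assign[b] for a, b in EXCLUSIVE)


def abduce(probs, threshold=0.5):
    """Return a rule-consistent True/False assignment for every label.

    probs maps each label to the network's predicted probability (used here
    as a stand-in for the confidence measure). If the thresholded prediction
    already satisfies the rules it is returned unchanged; otherwise the
    consistent assignment with the highest likelihood under probs is chosen.
    """
    predicted = {label: probs[label] >= threshold for label in LABELS}
    if is_consistent(predicted):
        return predicted

    best, best_score = None, -1.0
    # Exhaustive search is fine for a handful of labels; it grows as 2^n.
    for bits in product([False, True], repeat=len(LABELS)):
        cand = dict(zip(LABELS, bits))
        if not is_consistent(cand):
            continue
        score = 1.0
        for label in LABELS:
            score *= probs[label] if cand[label] else 1.0 - probs[label]
        if score > best_score:
            best, best_score = cand, score
    return best


if __name__ == "__main__":
    # "surrender" and "confession" both exceed 0.5, which violates a rule.
    network_output = {
        "surrender": 0.9,
        "confession": 0.6,
        "recidivist": 0.1,
        "first_offender": 0.8,
    }
    print(abduce(network_output))
```

In a semi-supervised loop of the kind the abstract outlines, the corrected assignments for unlabeled documents would serve as pseudo-labels when the recognition model is retrained.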
Authors
LI Jinye (李锦烨)
HUANG Ruizhang (黄瑞章)
QIN Yongbin (秦永彬)
CHEN Yanping (陈艳平)
TIAN Xiaoyu (田小瑜)
College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China; State Key Laboratory of Public Big Data (Guizhou University), Guiyang, Guizhou 550025, China
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals (北大核心)
2022, Issue 6, pp. 1802-1807 (6 pages)
Funding
National Natural Science Foundation of China (62066008)
Key Project of the Science and Technology Foundation of Guizhou Province (黔科合基础[2020]1Z055)
Keywords
sentencing circumstance recognition
semi-supervised learning
multi-label classification
abductive learning
confidence learning