Abstract
This paper builds an ontology-based model for automatically extracting and annotating case information from criminal judgment texts, guided by a criminal trial ontology. Exploiting the semi-structured nature of legal case texts, it uses regular expressions to build extraction rule templates from the document structure and clue words, and applies natural language processing techniques to extract the relevant semantic information precisely. Semantic annotation is then used to populate an instance base for the criminal trial ontology, converting large volumes of case texts into a semantic information network that supports similar-case retrieval and judgment recommendation. Experiments show that the extraction results largely meet expectations.
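The rule-template idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's rule set: the clue words, field names, sample sentence, and the `extract` and `to_triples` helpers are all assumptions made for illustration, and the triples are a simplified stand-in for ontology instances.

```python
import re

# Hypothetical clue-word templates; the field names and patterns are
# illustrative assumptions, not the rules used in the paper.
RULES = {
    "defendant": re.compile(r"被告人([^,,。]{2,4})[,,]"),
    "charge": re.compile(r"犯([^,,。]+?罪)"),
    "sentence": re.compile(r"判处(有期徒刑[^,,。;;]+)"),
}


def extract(text):
    """Return the first match of each rule, or None when a clue word is absent."""
    result = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        result[field] = match.group(1) if match else None
    return result


def to_triples(case_id, fields):
    """Turn extracted fields into (subject, predicate, object) annotations."""
    return [(case_id, name, value) for name, value in fields.items() if value]


# Invented sample sentence in the style of a criminal judgment.
sample = "被告人张某,男,汉族。被告人张某犯盗窃罪,判处有期徒刑一年,并处罚金人民币二千元。"
fields = extract(sample)
triples = to_triples("case_001", fields)
```

Each pattern anchors on a clue word (被告人, 犯, 判处) and captures the text up to the next punctuation mark, which is one plausible way to use the semi-structured layout of judgments. A real system would need many more templates and the NLP step the abstract mentions for fields that fixed patterns cannot capture.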
Source
《现代图书情报技术》
CSSCI
PKU Core Journal (北大核心)
2013, No. 6, pp. 23-29 (7 pages)
New Technology of Library and Information Service
Keywords
Semantic annotation
Ontology
Rule extraction
Natural language processing