Abstract
Multi-label text classification is an important task in natural language processing. The semantic information of a text's labels is closely related to the content of the document, yet traditional multi-label text classification methods tend to ignore label semantics or suffer from insufficient label semantic information. To address these problems, we propose LEKA (Label Embedding and Knowledge-Aware), a multi-label text classification method that fuses label embedding and knowledge awareness. LEKA relies on the document text and its multiple labels: it obtains label-related attention through label embedding, takes the semantic information of labels into account, establishes the relationship between labels and document content, and applies the labels to text classification. In addition, to enhance label semantics, external knowledge is introduced via knowledge graph embedding to semantically expand the label text. Compared with other classification models on the public AAPD and RCV1-V2 datasets, experimental results show that LEKA improves the F1 score by 3.5% and 2.1%, respectively, over the LCFA (Label Combination and Fusion of Attentions) model.
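The core idea the abstract describes — using label embeddings to compute label-related attention over the document — can be illustrated with a minimal sketch. This is not the paper's actual implementation; the function name, shapes, and random inputs are illustrative assumptions, and the knowledge-graph enrichment of label embeddings is only noted in a comment.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def label_attention(tokens, labels):
    """Label-wise attention: each label embedding attends over token embeddings.

    tokens: (seq_len, d) token representations of the document
    labels: (num_labels, d) label embeddings (in LEKA these would additionally
            be enriched with knowledge-graph embeddings; omitted here)
    returns: (num_labels, d) label-specific document representations
    """
    scores = labels @ tokens.T            # (num_labels, seq_len) similarity
    weights = softmax(scores, axis=-1)    # attention over tokens, per label
    return weights @ tokens               # weighted sum of tokens, per label

# Toy example with random embeddings
rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))   # a 6-token document, dimension 4
labels = rng.normal(size=(3, 4))   # 3 candidate labels, dimension 4
reps = label_attention(tokens, labels)
print(reps.shape)  # (3, 4)
```

Each row of `reps` is a document representation conditioned on one label; a per-label classifier head would then score each row to produce the multi-label prediction.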
Authors
Hai Feng; Jialin Ma; Linjie Xu; Yu Yang; Qian Xie
(Faculty of Computer and Software, Huaiyin Institute of Technology, Huaian 223001, China; Jiangsu Eazytec Company Limited, Wuxi 214200, China)
Source
Journal of Nanjing University (Natural Science) (《南京大学学报(自然科学版)》)
Indexed in: CAS; CSCD; Peking University Core Journals (北大核心)
2023, Issue 2, pp. 273-281 (9 pages)
Funding
National Natural Science Foundation of China (61602202).
Keywords
multi-label text classification
label embedding
knowledge graph
attention mechanism