Abstract
Cross-border ethnic groups are groups whose settlement areas span a national border but who retain certain shared ethnic characteristics and a common sense of ethnic identity. Text classification for cross-border ethnic culture can be treated as a fine-grained, domain-specific text classification task, but it currently suffers from ambiguous category labels. To address this, the paper proposes a cross-border ethnic culture classification method that incorporates a domain knowledge graph. First, the knowledge triples in the knowledge graph are represented as entity semantic vectors by the TransE model, and these entity vectors are fused with the word vectors that the BERT pre-trained model produces for the text. The resulting enhanced semantic representation is fed into a BiGRU network for deep semantic feature extraction. Then an attention weight matrix is constructed to assign weights to the features and improve their quality, which completes the training of the classification model. Experimental results show that the proposed method achieves an F1 score of 89.6% on the cross-border ethnic culture text dataset, with precision and recall of 88.2% and 90.1%, respectively.
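The pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say how entity and word vectors are fused, so concatenation is assumed here; the BiGRU stage is omitted; and the attention step is reduced to a softmax over dot products with a hypothetical query vector. All function names and toy values are illustrative only.

```python
import math


def transe_score(h, r, t):
    """TransE distance ||h + r - t||_2; a lower value means a more plausible triple."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def fuse(word_vec, entity_vec):
    """Assumed fusion: concatenate a word vector with its linked entity vector."""
    return list(word_vec) + list(entity_vec)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Weight each feature vector by softmax(feature . query) and sum them."""
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Toy triple whose embeddings satisfy h + r = t exactly.
h, r, t = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
triple_distance = transe_score(h, r, t)

# Toy word vectors (stand-ins for BERT outputs) and entity vectors per token.
# Assumption: tokens with no linked entity get a zero entity vector.
word_vecs = [[0.2, 0.1], [0.9, 0.4], [0.3, 0.3]]
entity_vecs = [[0.0, 0.0], h, [0.0, 0.0]]
fused = [fuse(w, e) for w, e in zip(word_vecs, entity_vecs)]

# Hypothetical query vector; in a trained model it would be learned.
query = [0.5, 0.5, 1.0, 0.0]
sentence_vec, attn_weights = attention_pool(fused, query)
```

In a full model the fused vectors would pass through a BiGRU before attention, and the pooled vector would feed a softmax classifier over the culture categories.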
Authors
毛存礼
王斌
雷雄丽
满志博
王红斌
张亚飞
MAO Cun-li; WANG Bin; LEI Xiong-li; MAN Zhi-bo; WANG Hong-bin; ZHANG Ya-fei (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650000, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650000, China; Kunming Metallurgical College, Kunming 650000, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal
2022, No. 5, pp. 943-949 (7 pages)
Journal of Chinese Computer Systems
Funding
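Supported by the National Natural Science Foundation of China (61662041, 61866019)
Supported by the Key Project of the Natural Science Foundation of Yunnan Province (2019FA023)
Supported by the Yunnan Province Reserve Talent Program for Young and Middle-aged Academic and Technical Leaders (2019HB006)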