Abstract
Event detection and classification comprises two subtasks: identifying event trigger words and classifying them into the correct event types. Trigger word identification is the key step, and neural network-based models have been the mainstream approach to it in recent years. However, identifying triggers becomes difficult when a sentence is composed of characters and words whose semantic structure is unclear or whose meanings are similar. This paper proposes a model that fuses character-level and word-level information and then uses a prototypical network to sharpen event classification. The model takes embeddings that fuse character and word information as input and projects them into a high-dimensional feature space. For each type, the mean of the sample embeddings is taken as the cluster center, i.e. the prototype, with Euclidean distance as the distance metric. Training pulls each sample as close as possible to the prototype of its own class and pushes it as far as possible from the prototypes of other classes, so that the trigger words in a sentence are identified, and the event types distinguished, more accurately.
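The prototype step described in the abstract can be illustrated with a minimal sketch. It assumes the fused character-word embeddings have already been computed and are given as plain lists of floats. The event type names (`Attack`, `Transport`), the toy 2-D vectors, and all function names are hypothetical. The softmax over negative squared distances is the standard prototypical-network formulation, assumed here because the abstract does not state the exact loss.

```python
import math
from collections import defaultdict


def compute_prototypes(embeddings, labels):
    """Return the mean embedding (prototype) of each class."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


def class_probabilities(query, prototypes):
    """Softmax over negative squared Euclidean distances (assumed formulation)."""
    scores = {lbl: -math.dist(query, p) ** 2 for lbl, p in prototypes.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {lbl: math.exp(s - top) for lbl, s in scores.items()}
    total = sum(exps.values())
    return {lbl: e / total for lbl, e in exps.items()}


def prototype_loss(query, true_label, prototypes):
    """Negative log-probability of the true class: small when the query is
    near its own prototype and far from the other prototypes."""
    return -math.log(class_probabilities(query, prototypes)[true_label])


# Toy support set: four hypothetical candidate-trigger embeddings, two event types.
support = [[1.0, 0.0], [3.0, 0.0], [0.0, 4.0], [0.0, 6.0]]
labels = ["Attack", "Attack", "Transport", "Transport"]

prototypes = compute_prototypes(support, labels)
query = [1.5, 0.5]
predicted = classify(query, prototypes)
```

Minimizing `prototype_loss` over the parameters of the embedding network (not modeled in this sketch) is one way to realize the stated training objective of moving samples toward their own prototype and away from the others.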
Source
《计算机科学与应用》
2021, No. 4, pp. 920-927 (10 pages in total)
Computer Science and Application