Funding: Supported by the China Postdoctoral Science Foundation (2014T70722) and the Humanities and Social Science Foundation of the Ministry of Education of China (16YJCZH004).
Abstract: Structural features require complicated pre-processing and are probably domain-dependent. To reduce the time cost of pre-processing, we propose a novel neural network architecture, a bi-directional long short-term memory recurrent neural network (Bi-LSTM-RNN) model based on low-cost sequence features such as words and part-of-speech (POS) tags, to classify the relation between two entities. First, the model performs bi-directional recurrent computation along the tokens of a sentence. Then, the sequence is divided into five parts and standard pooling functions are applied over the token representations of each part. Finally, the pooled representations are concatenated and fed into a softmax layer for relation classification. We evaluate our model on two standard benchmark datasets from different domains, SemEval-2010 Task 8 and BioNLP-ST 2016 Task BB3. On SemEval-2010 Task 8, the performance of our model matches that of the state-of-the-art models, achieving an F1 of 83.0%. On BioNLP-ST 2016 Task BB3, our model obtains an F1 of 51.3%, which is comparable with that of the best system. Moreover, we find that the context between the two target entities plays an important role in relation classification and can serve as a replacement for the shortest dependency path.
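The five-part pooling step described above can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the function name, the (start, end) entity-span format, and the choice of max-pooling are assumptions; the abstract only states that the sequence is split into five segments (before the first entity, the first entity, between the entities, the second entity, after the second entity), pooled per segment, and concatenated.

```python
def five_part_pool(token_vecs, e1_span, e2_span):
    """Pool token representations over the five segments defined by two
    entity spans, then concatenate the pooled vectors.

    token_vecs: list of token vectors (each a list of floats), e.g. Bi-LSTM outputs.
    e1_span, e2_span: (start, end) token indices of the two entities, end exclusive.
    """
    (s1, t1), (s2, t2) = e1_span, e2_span
    parts = [
        token_vecs[:s1],    # before entity 1
        token_vecs[s1:t1],  # entity 1
        token_vecs[t1:s2],  # between the two entities
        token_vecs[s2:t2],  # entity 2
        token_vecs[t2:],    # after entity 2
    ]
    dim = len(token_vecs[0])
    pooled = []
    for part in parts:
        if part:
            # element-wise max over the tokens of this segment
            pooled.extend(max(col) for col in zip(*part))
        else:
            # empty segment (e.g. nothing before entity 1): zero vector
            pooled.extend([0.0] * dim)
    return pooled  # length = 5 * dim; this vector feeds the softmax layer
```

The concatenated output always has a fixed length of five times the token dimension, regardless of sentence length, which is what allows a plain softmax classifier on top.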
Funding: National Natural Science Foundation of China (Grant Nos. 62376166, 62306188, 61876113) and the National Key R&D Program of China (No. 2022YFC3303504).
Abstract: Discourse relation classification is a fundamental task in discourse analysis and is essential for understanding the structure and connections of texts. Implicit discourse relation classification aims to determine the relationship between adjacent sentences; it is very challenging because it lacks explicit discourse connectives as linguistic cues and suffers from insufficient annotated training data. In this paper, we propose a discriminative instance selection method that constructs synthetic implicit discourse relation data from easy-to-collect explicit discourse relations. An expanded instance consists of an argument pair and its sense label. We introduce an argument pair type classification task, which distinguishes between implicit and explicit argument pairs and selects the explicit argument pairs most similar to natural implicit argument pairs for data expansion. We also propose a simple label-smoothing technique to assign robust sense labels to the selected argument pairs. We evaluate our method on PDTB 2.0 and PDTB 3.0. The results show that our method consistently improves the performance of the baseline model and achieves competitive results with state-of-the-art models.
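The label-smoothing step mentioned above can be illustrated with the standard uniform formulation. The abstract does not spell out the paper's exact smoothing scheme, so this is one plausible instantiation, with an assumed function name and smoothing factor: a fraction eps of the probability mass is spread evenly over all sense classes, so a selected explicit argument pair's label becomes a soft distribution rather than a hard one-hot target.

```python
def smooth_label(true_idx, num_classes, eps=0.1):
    """Return a smoothed label distribution over sense classes.

    true_idx: index of the gold sense label.
    num_classes: number of discourse sense classes.
    eps: smoothing factor; eps of the mass is shared uniformly.
    """
    # every class receives eps / num_classes of the probability mass
    dist = [eps / num_classes] * num_classes
    # the gold class keeps the remaining 1 - eps
    dist[true_idx] += 1.0 - eps
    return dist  # sums to 1.0; used as a soft training target
```

Training against such soft targets keeps the model from fitting the selected explicit instances too confidently, which is the usual motivation for smoothing noisy synthetic labels.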