Abstract
Text accounts for most of the resources on the Internet, which places ever higher demands on the accuracy of text classification. In this manuscript, we first design a hybrid model, bidirectional encoder representation from transformers-hierarchical attention networks-dilated convolutions networks (BERT_HAN_DCN), built on the BERT pre-trained model, which has a superior ability to extract features. The model combines the advantages of the HAN and DCN models to obtain abundant semantic information by fusing contextual semantic features with hierarchical characteristics. Second, the traditional softmax algorithm increases the difficulty of learning samples of the same class, making similar features harder to distinguish. AM-softmax is therefore introduced to replace the traditional softmax. Finally, the fused model is validated on two datasets, where it achieves superior accuracy and F1-score; the experimental analysis shows that it outperforms general single models, such as HAN and DCN, based on the BERT pre-trained model. In addition, the improved AM-softmax network model is superior to the general softmax network model.
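The replacement of softmax by AM-softmax mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual additive-margin formulation, in which the feature and the class weight vectors are L2-normalized, a margin m is subtracted from the cosine of the true class, and all cosines are scaled by s. The function names and the default values s = 30 and m = 0.35 are assumptions chosen for illustration and are not taken from the manuscript.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def am_softmax_loss(feature, class_weights, label, s=30.0, m=0.35):
    """Additive-margin softmax loss for a single sample.

    feature       : list of floats, the sample representation
    class_weights : one weight vector per class
    label         : index of the true class
    s, m          : scale and margin (assumed defaults, not from the paper)
    """
    f = l2_normalize(feature)
    # Cosine similarity between the sample and every class weight vector.
    cosines = [
        sum(a * b for a, b in zip(f, l2_normalize(w))) for w in class_weights
    ]
    # Subtract the margin only from the true-class cosine, then scale.
    logits = [
        s * (c - m) if j == label else s * c for j, c in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp, then negative log-likelihood.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]
```

Setting m = 0 reduces this sketch to an ordinary softmax loss over scaled cosine similarities, so the margin is the only term that forces the true-class cosine to exceed the others by a fixed gap.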
Authors
ZHAO Yuanyuan
GAO Shining
LIU Yang
GONG Xiaohui
赵媛媛; 高世宁; 刘洋; 宫晓蕙 (College of Information Science and Technology, Donghua University, Shanghai 201620, China; Engineering Research Center of Digitized Textile & Apparel Technology, Ministry of Education, Donghua University, Shanghai 201620, China)
Funding
Fundamental Research Funds for the Central Universities, China (No. 2232018D3-17).