Abstract
A bilingual adversarial learning approach is proposed to make full use of the information in unlabeled samples. Specifically, labeled and unlabeled Chinese samples are encoded by separate LSTMs and then fed into a classifier and a discriminator for adversarial learning. The classifier serves to bring the labeled and unlabeled samples into the same distribution, while the discriminator distinguishes whether an input sample is labeled or unlabeled. Finally, an identical adversarial neural network is built for the English corpus, and the Chinese and English adversarial networks are learned jointly to improve the performance of semi-supervised sentiment classification. Experimental results show that the proposed semi-supervised sentiment classification approach based on bilingual adversarial learning achieves good accuracy on training sets with different numbers of labeled samples and improves clearly over the other baseline methods.
Authors
刘杰
刘欢
李寿山
闫伟
LIU Jie; LIU Huan; LI Shoushan; YAN Wei (Institute of Information Engineering, Suqian College, Suqian 223800, China; School of Computer Science & Technology, Soochow University, Suzhou 215006, China)
Source
Journal of Zhengzhou University (Natural Science Edition)
CAS
Peking University Core Journal (北大核心)
2020, Issue 2, pp. 59-63 (5 pages)
Funding
Suqian Science and Technology Program (Z2018225, S201712)
National Natural Science Foundation of China (61331011, 61375073).
Keywords
unlabeled samples
bilingual adversarial learning
semi-supervised sentiment classification