Abstract
Learning with label proportions is a learning task in which a classification model is built using only the label proportions of bags of samples, without the labels of individual samples. Existing methods treat this problem as a single task and, because training samples are insufficient, perform poorly in text classification. Since transfer learning can alleviate the shortage of training data to some extent, the question of how to use historical data (source task data) to help classify newly generated data (target task data) becomes especially important. This paper proposes a transfer learning algorithm based on label proportion information that transfers knowledge from the source task to the target task, helping the target task build a better classifier. To obtain the transfer learning model, the method converts the original optimization problem into a convex optimization problem and then solves the dual optimization problem to build an accurate classifier for the target task. Experimental results show that the proposed algorithm outperforms traditional methods under most conditions.
Source
Computer Science and Application (《计算机科学与应用》)
2020, No. 2, pp. 340-349 (10 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 61876044).