Abstract
With the development of artificial-intelligence technologies such as image processing and speech recognition, many learning methods, especially those built on deep learning frameworks, have achieved excellent performance, with large gains in accuracy and speed. The accompanying problems are equally clear: to learn stably, these methods usually require training on very large amounts of labeled data; otherwise they underfit and their performance degrades. As task complexity and data scale grow, the demands on the quantity and quality of manually labeled data rise accordingly, increasing both the cost and the difficulty of annotation. At the same time, learning a single task in isolation ignores experience from other tasks, which leads to redundant training, wastes learning resources, and limits achievable performance. To mitigate these problems, multi-task learning, which falls within the scope of transfer learning, has gradually attracted researchers' attention. Unlike single-task learning, which uses only the samples of one task, multi-task learning assumes a degree of similarity between the data distributions of different tasks and, on that basis, establishes connections between tasks through joint training and optimization. This training mode promotes information exchange among tasks so that they learn from one another. In particular, when each task has only a limited number of samples, every task can draw on the others: through information transfer during learning, each task can indirectly exploit the other tasks' data, which relieves the dependence on large amounts of labeled data and improves the learning performance of each task. Against this background, this paper first introduces the concept of related tasks, classifies related tasks by function, and describes the characteristics of each type. It then divides the current mainstream algorithms into two major categories according to how they process data and model task relationships: structured multi-task learning algorithms and deep multi-task learning algorithms. Structured multi-task learning algorithms adopt linear models, make structural assumptions directly on the data, and represent task relationships with the original labeled features; according to the object of learning, they can be further subdivided into task-level and feature-level structures.
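The abstract's central claim, that joint training lets a task with few labeled samples indirectly use another task's data, can be illustrated with a minimal sketch that is not from the paper itself: two linear-regression "tasks" are assumed to share the same true weights (the extreme case of task similarity), and a jointly trained model is compared against independently trained single-task models. All names and parameter choices below are illustrative.

```python
import numpy as np

# Two regression tasks assumed to share the same ground-truth weights;
# each task has only 12 labeled samples in a 10-dimensional space.
rng = np.random.default_rng(42)
d, n_per_task, noise = 10, 12, 0.5
w_true = rng.normal(size=d)  # shared ground-truth weights

def make_task():
    X = rng.normal(size=(n_per_task, d))
    y = X @ w_true + noise * rng.normal(size=n_per_task)
    return X, y

(X1, y1), (X2, y2) = make_task(), make_task()

def ridge(X, y, lam=1e-3):
    # Ridge regression, closed form: (X^T X + lam*I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Single-task learning: each task fits on its own 12 samples.
err_single = max(np.linalg.norm(ridge(X1, y1) - w_true),
                 np.linalg.norm(ridge(X2, y2) - w_true))

# Multi-task (fully shared) learning: one model fit on both tasks' data,
# so each task indirectly benefits from the other task's samples.
X_joint = np.vstack([X1, X2])
y_joint = np.concatenate([y1, y2])
err_joint = np.linalg.norm(ridge(X_joint, y_joint) - w_true)

print(f"worst single-task estimation error: {err_single:.3f}")
print(f"joint (multi-task) estimation error: {err_joint:.3f}")
```

With so few samples per task, the single-task estimates are poorly conditioned and noisy, while the pooled fit recovers the shared weights far more accurately. Fully sharing one weight vector is only the simplest coupling; the structured methods surveyed in the paper instead encode softer similarity assumptions at the task or feature level.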
Authors
ZHANG Yu; LIU Jian-Wei; ZUO Xin (Department of Automation, China University of Petroleum, Beijing 102249)
Source
《计算机学报》 (Chinese Journal of Computers)
2020, Issue 7, pp. 1340-1378 (39 pages)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
Funding
Supported by the National Key Research and Development Program of China (2016YFC0303703-03) and the Annual Forward-looking Orientation and Cultivation Project of China University of Petroleum, Beijing (2462018QZDX02).
Keywords
multi-task learning
information transfer
task similarity
Bayesian generative multi-task learning
discriminative multi-task learning
deep multi-task learning