Abstract
Transfer learning can effectively improve classifier performance when the existing training data are out of date and new data are very scarce. This paper proposes a clustering-based transfer learning algorithm for text classification and describes its main idea and implementation steps. Experiments on a Chinese text corpus compare the algorithm with a transfer-unaware (non-transfer) learning algorithm. The results show that the method effectively improves classifier performance.
Source
Computer Systems & Applications (《计算机系统应用》)
2010, No. 12, pp. 238-241 (4 pages)
Funding
National Natural Science Foundation of China (60873100)
Keywords
outdated training data
scarce new data
transfer learning
clustering
text