Abstract
In semi-supervised learning, classifiers introduce noisy data into the training process, which degrades classifier performance and reduces classification accuracy. To address this, the paper proposes a self-regulating, twice pseudo-iterative algorithm. The algorithm retains the three-classifier design of the Tri-training algorithm. Under certain conditions it introduces a small amount of manual work, so that labels that are hard to classify do not hold up training. A self-regulation function reduces the noisy data that arises during classification and limits the addition of data that contributes nothing to improving classifier performance. In addition, a twice pseudo-iterative training process raises the utilization and contribution of unlabeled samples. Experimental results show that the algorithm effectively improves classifier performance, raises the utilization and contribution of unlabeled samples, and improves classification accuracy to some extent.
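The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general Tri-training agreement step that the proposed method builds on: a sample is pseudo-labeled for one classifier when the other two agree on it. The function name `tri_training_round`, the toy predictors `h0`, `h1`, `h2`, and the rule that a sample is set aside for manual work when all three classifiers disagree are assumptions made for illustration, not the paper's actual criteria. The self-regulation function and the twice pseudo-iterative process are not specified in the abstract and are not sketched here.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Sample = Tuple[float, ...]
Predictor = Callable[[Sample], Hashable]


def tri_training_round(
    predictors: Sequence[Predictor],
    unlabeled: Sequence[Sample],
) -> Tuple[List[List[Tuple[Sample, Hashable]]], List[Sample]]:
    """One Tri-training-style agreement step (illustrative sketch only).

    Returns (pseudo, needs_review):
      pseudo[i]    - samples pseudo-labeled for classifier i because the
                     other two classifiers agreed on the label
      needs_review - samples on which all three classifiers disagreed;
                     ASSUMPTION: these are the ones handed to manual work
    """
    if len(predictors) != 3:
        raise ValueError("exactly three classifiers are expected")

    pseudo: List[List[Tuple[Sample, Hashable]]] = [[], [], []]
    needs_review: List[Sample] = []

    for x in unlabeled:
        votes = [predict(x) for predict in predictors]
        if len(set(votes)) == 3:
            # No pair agrees: hypothetical trigger for manual labeling.
            needs_review.append(x)
            continue
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            if votes[j] == votes[k]:
                # The other two agree, so classifier i receives the label.
                pseudo[i].append((x, votes[j]))
    return pseudo, needs_review


# Toy stand-ins for three trained classifiers (hypothetical thresholds).
def h0(x: Sample) -> str:
    return "a" if x[0] < 0 else "b"


def h1(x: Sample) -> str:
    if x[0] < 1:
        return "a"
    return "b" if x[0] < 4 else "d"


def h2(x: Sample) -> str:
    return "a" if x[0] < 2 else "c"


unlabeled = [(-1.0,), (0.5,), (3.0,), (5.0,)]
pseudo, needs_review = tri_training_round([h0, h1, h2], unlabeled)
```

In a full training loop, each `pseudo[i]` would be added to the training set of classifier `i` before retraining. Any filtering of those additions, which the abstract attributes to the self-regulation function, would happen at that point.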
Source
Journal of Guangxi Normal University: Natural Science Edition (《广西师范大学学报(自然科学版)》)
CAS
Peking University Core Journals (北大核心)
2011, No. 3, pp. 110-114 (5 pages)
Funding
Open Project Fund of the Institute of Software, Chinese Academy of Sciences (SYSKF0701)
National Natural Science Foundation of China (61070062)