Abstract
When the label proportion of the unlabeled data deviates substantially from that of the labeled data, semi-supervised support vector machines (S^3VMs) perform poorly, and a supervised support vector machine (SVM) trained on the labeled data alone can outperform an S^3VM that also uses the unlabeled data. To address this, a shifted label proportion aware S^3VM (fairS^3VM) is proposed. Specifically, the label means of the unlabeled data are first estimated. The label means corresponding to multiple label proportions are then integrated under the worst-case scenario, which improves the performance guarantee of S^3VMs. Experimental results show that the proposed method effectively improves the performance guarantee of S^3VMs when the label proportion is shifted.
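The two steps named in the abstract (estimating label means, then taking the worst case over several label proportions) can be illustrated with a minimal sketch. This is not the authors' fairS^3VM algorithm, whose details are not given here. It assumes a fixed scoring direction `w`, assigns the top-scoring fraction `p` of unlabeled points to the positive class to estimate the two label means, and reports the smallest projected gap between the means over a set of candidate proportions. The function names `label_means_for_proportion` and `worst_case_mean_gap`, the toy data, and the candidate proportions are all hypothetical.

```python
import numpy as np


def label_means_for_proportion(X_u, scores, p):
    """Estimate class means of unlabeled data, assuming a positive fraction p.

    Illustrative assumption: the top-p fraction of points by score is positive.
    """
    n = len(X_u)
    n_pos = int(round(p * n))
    n_pos = min(max(n_pos, 1), n - 1)  # keep both classes non-empty
    order = np.argsort(-scores)        # indices sorted by descending score
    pos, neg = order[:n_pos], order[n_pos:]
    return X_u[pos].mean(axis=0), X_u[neg].mean(axis=0)


def worst_case_mean_gap(w, X_u, scores, proportions):
    """Smallest projected gap between label means over candidate proportions."""
    gaps = []
    for p in proportions:
        mu_pos, mu_neg = label_means_for_proportion(X_u, scores, p)
        gaps.append(float(w @ (mu_pos - mu_neg)))
    return min(gaps)


# Toy unlabeled data: 30 points around (+2, +2) and 70 points around (-2, -2).
rng = np.random.default_rng(0)
X_u = np.vstack([
    rng.normal(+2.0, 1.0, size=(30, 2)),
    rng.normal(-2.0, 1.0, size=(70, 2)),
])
w = np.array([1.0, 1.0]) / np.sqrt(2.0)  # hypothetical scoring direction
scores = X_u @ w
gap = worst_case_mean_gap(w, X_u, scores, proportions=[0.2, 0.3, 0.5])
print(f"worst-case projected gap between label means: {gap:.3f}")
```

Taking the minimum over proportions means a direction is judged by its least favorable candidate proportion. This is one plausible reading of "worst-case" integration and is used here only for illustration.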
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals (北大核心)
2016, No. 7, pp. 625-632 (8 pages)
Pattern Recognition and Artificial Intelligence
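Funding
Supported by the National Natural Science Foundation of China, Young Scientists Fund (No. 61403186)
and the Natural Science Foundation of Jiangsu Province, Youth Fund (No. BK20140613)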
Keywords
Semi-supervised Learning
Semi-supervised Support Vector Machine
Shifted Label Proportion
Ensemble Method