Abstract
Building on the query-by-committee (QBC) algorithm, this paper proposes a weighted voting method that dynamically adjusts the weights of the committee's member classifiers during sampling. When evaluating how valuable an unlabeled sample would be to label, the method strengthens the query contribution of high-accuracy members and reduces the voting influence of high-error members, which lowers the number of labeling and learning rounds needed during training. Experiments on the UCI Statlog (Australian Credit Approval) dataset, where the task is to identify users' credit levels, compare the method with other sampling methods and show that it reaches stable generalization accuracy at a lower labeling cost.
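The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea: score an unlabeled sample by the entropy of the committee's weighted votes, query its label when that entropy exceeds a labeling threshold, and then shrink the weights of members that voted wrongly. The function names, the multiplicative penalty `beta` (borrowed from the weighted-majority style of update), and the threshold value are all assumptions made for illustration, not the authors' actual rule.

```python
import math
from collections import defaultdict


def weighted_vote_entropy(votes, weights):
    """Entropy of the weight-normalized vote distribution over labels."""
    total = sum(weights)
    mass = defaultdict(float)
    for label, w in zip(votes, weights):
        mass[label] += w / total
    return -sum(p * math.log(p) for p in mass.values() if p > 0)


def should_query(votes, weights, threshold):
    """Ask for a label only when weighted disagreement exceeds the threshold."""
    return weighted_vote_entropy(votes, weights) > threshold


def update_weights(weights, votes, true_label, beta=0.5):
    """Assumed rule: multiply the weight of each wrong voter by beta (0 < beta < 1)."""
    return [w * (beta if v != true_label else 1.0)
            for w, v in zip(weights, votes)]


# Three committee members vote on one unlabeled sample (labels are 0 or 1).
votes = [1, 1, 0]
weights = [1.0, 1.0, 1.0]

before = weighted_vote_entropy(votes, weights)
if should_query(votes, weights, threshold=0.5):   # hypothetical threshold
    true_label = 1                                # label returned by the annotator
    weights = update_weights(weights, votes, true_label)
after = weighted_vote_entropy(votes, weights)
```

In this toy run the dissenting member is penalized once, so the same split of votes carries less weighted disagreement afterwards. That is the effect the abstract describes: high-error members have less influence on which samples get queried.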
Source
《计算机工程与应用》
CSCD
2014, No. 21, pp. 259-263 (5 pages)
Computer Engineering and Applications
Funding
Key Natural Science Research Project of Higher Education Institutions, Anhui Provincial Department of Education (No. KJ2012A211)
Keywords
active learning
sampling query
weighted voting
entropy
labeling threshold