A feature augmentation-based method for constructing generative-discriminative hybrid models
Abstract From the perspective of the probabilistic framework, a generative model first learns the joint probability distribution from the data and then derives the conditional probability distribution, and it usually converges faster. A discriminative model, in contrast, learns the conditional probability distribution directly from the data and often achieves higher accuracy. A generative-discriminative hybrid model combines the two and inherits the advantages of both. However, existing methods for building hybrid models must split the original features into two independent feature spaces, one for training the generative model and one for training the discriminative model. This feature division not only increases the time complexity of the model but also weakens the expressive power of the original feature space. To solve this problem, this paper proposes a feature augmentation-based method for constructing generative-discriminative hybrid models. The method first uses the generative model to learn the conditional probability distribution, then appends the learned conditional probabilities to the original feature space as new features, and finally trains the discriminative model in the augmented feature space to predict the final classification result. Because the models are combined through feature augmentation, the original features need not be divided, the time complexity stays low, and the expressive power of the original feature space is enhanced. Experimental results on 36 classical UCI benchmark datasets show that the proposed method is effective and general, and that it follows the bias-variance trade-off principle.
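The three steps described in the abstract (learn class posteriors with a generative model, append them to the original features, train a discriminative model on the augmented features) can be illustrated with a minimal sketch. The abstract does not name the concrete models, so the choices here are assumptions made only for illustration: Gaussian naive Bayes as the generative model, binary logistic regression trained by batch gradient descent as the discriminative model, and a synthetic two-blob dataset in place of the UCI datasets. All function names are hypothetical, and this is not the authors' implementation.

```python
import math
import random


def fit_gaussian_nb(X, y):
    """Generative step (assumed model): class priors and per-feature Gaussians."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        cols = list(zip(*rows))
        means = [sum(col) / len(col) for col in cols]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-6
                     for col, m in zip(cols, means)]
        model[c] = (len(rows) / len(X), means, variances)
    return model


def nb_posterior(model, x):
    """Return [P(c | x) for each class c], obtained from the joint by Bayes' rule."""
    log_joint = []
    for c in sorted(model):
        prior, means, variances = model[c]
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            ll -= 0.5 * math.log(2 * math.pi * s2) + (v - m) ** 2 / (2 * s2)
        log_joint.append(ll)
    top = max(log_joint)  # subtract the maximum for numerical stability
    unnorm = [math.exp(ll - top) for ll in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def augment(model, X):
    """Append the generative posteriors to each original feature vector."""
    return [list(x) + nb_posterior(model, x) for x in X]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Discriminative step (assumed model): binary logistic regression, labels in {0, 1}."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predict class 1 when the estimated probability is at least 0.5."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


def make_blobs(n_per_class, rng):
    """Synthetic toy data: two 2-D Gaussian blobs centred at (0, 0) and (2, 2)."""
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_blobs(100, rng)
X_test, y_test = make_blobs(100, rng)

nb = fit_gaussian_nb(X_train, y_train)   # 1. train the generative model
Z_train = augment(nb, X_train)           # 2. original features + posteriors
w, b = fit_logistic(Z_train, y_train)    # 3. train the discriminative model
Z_test = augment(nb, X_test)
accuracy = sum(predict(w, b, z) == t for z, t in zip(Z_test, y_test)) / len(y_test)
print(f"test accuracy on toy data: {accuracy:.2f}")
```

Note that the original features are kept intact: each augmented vector is the original vector plus one posterior per class, so no feature division is needed, which is the property the abstract emphasizes.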
Authors Wenjun ZHANG (张文钧); Liangxiao JIANG (蒋良孝); Huan ZHANG (张欢) (School of Computer Science, China University of Geosciences, Wuhan 430074, China; Hubei Key Laboratory of Intelligent Geo-Information Processing, Wuhan 430074, China)
Source Scientia Sinica (Informationis) (《中国科学:信息科学》), CSCD, PKU Core, 2022, No. 10, pp. 1792-1807 (16 pages)
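Funding Supported by the Key Program of the Joint Funds of the National Natural Science Foundation of China (Grant No. U1711267) and the Fundamental Research Funds for the Central Universities (Grant No. CUGGC03).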
Keywords generative model; discriminative model; feature augmentation; conditional probability distribution; bias-variance trade-off