Dynamic ensemble algorithm of SMOTE and rotation forest for imbalanced hyperspectral remote sensing classification

Cited by: 3
Abstract: Rotation Forest (RoF) is a powerful ensemble classifier that has been applied successfully to hyperspectral image classification. Real-world data, however, are often class-imbalanced, which leads the traditional RoF algorithm to favor majority-class samples at the expense of minority-class accuracy. The Synthetic Minority Oversampling Technique (SMOTE) increases the number of minority-class samples by generating synthetic ones, thereby balancing the class distribution of the data set. SMOTE is currently used mainly as a data preprocessing step, though, and it risks introducing artificial noise in multi-class problems. To address multi-class imbalance in hyperspectral data, this paper proposes a new dynamic ensemble algorithm that combines SMOTE and RoF. The algorithm uses a dynamic sampling factor to integrate class distribution optimization with base classifier training; it adaptively generates class-balanced data sets and reduces the influence of noise on the base classifiers. Three public hyperspectral data sets (Indian Pines, Salinas, and Pavia University) are used to test the algorithm. Four comparison algorithms are selected: random forest, traditional RoF, RoF with random oversampling as preprocessing, and RoF with SMOTE as preprocessing. The evaluation criteria are overall accuracy, average accuracy, F-measure, G-mean, minimum recall, ensemble classifier diversity, model training time, and the McNemar test. The experimental results show that the proposed method has a clear classification advantage: it improves the recognition accuracy of minority-class samples without lowering overall classification accuracy.
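The abstract describes SMOTE as generating synthetic minority-class samples. As a rough illustration only, the sketch below shows the standard SMOTE interpolation step in plain Python: pick a minority sample, pick one of its k nearest minority neighbours, and place a new point at a random position on the segment between them. It is not the paper's dynamic ensemble algorithm, and it does not model the dynamic sampling factor, whose definition the abstract does not give. The function name `smote` and the parameters `n_new`, `k`, and `seed` are hypothetical names chosen for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority neighbours.

    minority: list of feature vectors (lists of floats) from one minority class.
    All names here are illustrative assumptions, not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, sorted by Euclidean distance.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Random point on the segment between the base sample and the neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    minority_pixels = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for sample in smote(minority_pixels, n_new=3, k=2):
        print(sample)
```

Because each synthetic point lies on a segment between two existing minority samples, it stays inside the region those samples span. This also hints at the noise risk the abstract mentions: if either endpoint is itself noisy or sits inside another class, the interpolated samples inherit that problem.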
Authors: TONG Yingping (童莹萍); FENG Wei (冯伟); SONG Yijia (宋怡佳); QUAN Yinghui (全英汇); HUANG Wenjiang (黄文江); GAO Lianru (高连如); ZHU Wentao (朱文涛); XING Mengdao (邢孟道) (School of Electronic Engineering, Xidian University, Xi'an 710071, China; Research Institute of Advanced Remote Sensing Technology, Xidian University, Xi'an 710071, China; Key Laboratory of Digital Earth Science, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; Academy of Advanced Interdisciplinary Research, Xidian University, Xi'an 710071, China)
Source: National Remote Sensing Bulletin (《遥感学报》); indexed in EI, CSCD, and the Peking University Core Journals list; 2022, No. 11, pp. 2369-2381 (13 pages)
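Funding: National Natural Science Foundation of China (Nos. 61772397, 12005169, 62201438); Natural Science Basic Research Program of Shaanxi Province (No. 2021JC-23); Science and Technology Development Special Project of the Yulin Science and Technology Bureau (No. CXY-2020-094); Shaanxi Forestry Science and Technology Innovation Key Project (No. SXLK2022-02-8).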
Keywords: ensemble learning; imbalanced classification; rotation forest; SMOTE; dynamic sampling