Abstract
To address classification on skewed (class-imbalanced) datasets, this paper proposes an improved minority-class over-sampling algorithm, named B-ISMOTE. The algorithm generates virtual minority-class instances by random interpolation within the n-dimensional sphere space formed by a borderline minority-class instance and its nearest-neighbor instances, thereby reducing the degree of imbalance in the data. Experimental results on real datasets show that B-ISMOTE achieves better classification performance than the SMOTE and B-SMOTE algorithms.
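The abstract does not spell out how the sphere is constructed or how borderline instances are selected, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes that (a) a minority instance is "borderline" when at least half, but not all, of its m nearest neighbors belong to the majority class, as in Borderline-SMOTE, and (b) the sphere is centered at the borderline instance with radius equal to the distance to one of its k nearest minority neighbors. The function names `borderline_mask`, `sample_in_ball`, and `sphere_oversample` are hypothetical.

```python
import numpy as np

def borderline_mask(X, y, minority_label, m=5):
    """Assumed rule (Borderline-SMOTE style): a minority instance is borderline
    if at least half, but not all, of its m nearest neighbours are majority."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)           # exclude each point from its own neighbours
    nn = np.argsort(dists, axis=1)[:, :m]     # indices of the m nearest neighbours
    n_major = (y[nn] != minority_label).sum(axis=1)
    return (y == minority_label) & (n_major >= m / 2) & (n_major < m)

def sample_in_ball(center, radius, rng):
    """Draw one point uniformly from the n-dimensional ball around `center`."""
    direction = rng.normal(size=center.shape)
    direction /= np.linalg.norm(direction)    # uniform random direction
    scale = radius * rng.random() ** (1.0 / center.size)
    return center + scale * direction

def sphere_oversample(X, y, minority_label, n_new, k=5, m=5, seed=0):
    """Generate n_new virtual minority instances (assumed sphere construction:
    centre = borderline instance, radius = distance to a minority neighbour)."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    border = X[borderline_mask(X, y, minority_label, m)]
    if len(border) == 0 or len(X_min) < 2:
        return np.empty((0, X.shape[1]))
    synthetic = []
    for _ in range(n_new):
        x = border[rng.integers(len(border))]
        d = np.linalg.norm(X_min - x, axis=1)
        order = np.argsort(d)[1:k + 1]        # skip x itself (distance 0)
        radius = d[order[rng.integers(len(order))]]
        synthetic.append(sample_in_ball(x, radius, rng))
    return np.array(synthetic)

# Toy data: six majority points on a line, two minority points near them,
# and six minority points far away.
X = np.array([[v, 0.0] for v in
              [0, 1, 2, 3, 4, 5, 5.4, 5.6, 9.0, 9.1, 9.2, 9.3, 9.4, 9.5]])
y = np.array([0] * 6 + [1] * 8)
new_points = sphere_oversample(X, y, minority_label=1, n_new=10, k=2, m=3)
```

Unlike SMOTE, which interpolates along the line segment between an instance and its neighbor, sampling inside a ball lets the virtual instances spread in every direction around the borderline instance. Whether B-ISMOTE uses exactly this centre and radius is not stated in the abstract.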
Source
《计算机工程》
CAS
CSCD
2012, No. 4, pp. 67-69 (3 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (Grant No. 60873196)
Keywords
skewed dataset
classification
over-sampling
virtual instance
n-dimensional sphere space