
A data stream classification method based on distance and a sampling mechanism (一种基于距离和采样机制的数据流分类方法) Cited by: 1

Data streams classification approach based on distance and sampling
Abstract: Data stream classification is widely used in real-world applications such as sensor networks and network monitoring. In real data streams, however, imbalanced class distributions and large numbers of missing class labels make classification considerably harder. To address both problems, this paper proposes an ensemble classification method based on distance and a sampling mechanism. The method first computes the distance between each unlabeled instance and the center points of the labeled positive and negative data chunks, and uses these distances to label the instance as positive or negative. It then rebuilds each data chunk by over-sampling positive instances and under-sampling negative instances to balance the chunk's class distribution, and builds an ensemble classification model on the rebalanced chunks. Experiments on simulated, incompletely labeled data streams with imbalanced class distributions show that, compared with classical algorithms of the same kind, the proposed method improves classification accuracy on incompletely labeled streams while reducing the impact of class imbalance.
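The labeling and resampling steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes Euclidean distance, the mean of each labeled class as its "center point", random sampling with replacement for over-sampling, and a balanced target size equal to half the chunk. The abstract specifies none of these details. The function names `label_by_centroid` and `rebalance` are hypothetical, and the ensemble model is not shown.

```python
import numpy as np


def label_by_centroid(unlabeled, pos_labeled, neg_labeled):
    """Label each unlabeled row by the nearer labeled class center.

    Assumption: Euclidean distance to the per-class mean vector.
    Returns 1 for positive (minority) and 0 for negative (majority).
    """
    pos_center = pos_labeled.mean(axis=0)
    neg_center = neg_labeled.mean(axis=0)
    d_pos = np.linalg.norm(unlabeled - pos_center, axis=1)
    d_neg = np.linalg.norm(unlabeled - neg_center, axis=1)
    return (d_pos < d_neg).astype(int)


def rebalance(X, y, rng):
    """Over-sample positives and under-sample negatives to equal counts.

    Assumption: each class is resampled to half the chunk size.
    """
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    if len(pos_idx) == 0 or len(neg_idx) == 0:
        return X, y  # nothing to balance against
    target = (len(pos_idx) + len(neg_idx)) // 2
    # Sample with replacement only when a class has fewer rows than target.
    pos_pick = rng.choice(pos_idx, size=target, replace=len(pos_idx) < target)
    neg_pick = rng.choice(neg_idx, size=target, replace=len(neg_idx) < target)
    keep = np.concatenate([pos_pick, neg_pick])
    return X[keep], y[keep]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy chunk: few labeled positives, many labeled negatives.
    pos_labeled = rng.normal(3.0, 0.5, size=(5, 2))
    neg_labeled = rng.normal(0.0, 0.5, size=(50, 2))
    unlabeled = rng.normal(0.0, 0.5, size=(45, 2))
    y_unlabeled = label_by_centroid(unlabeled, pos_labeled, neg_labeled)
    X = np.vstack([pos_labeled, neg_labeled, unlabeled])
    y = np.concatenate([np.ones(5, int), np.zeros(50, int), y_unlabeled])
    Xb, yb = rebalance(X, y, rng)
    print(Xb.shape, int((yb == 1).sum()), int((yb == 0).sum()))
```

A classifier trained on each rebalanced chunk could then serve as one member of the ensemble. How members are weighted or replaced over time is left open here because the abstract does not describe it.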
Authors: Hu Xuegang (胡学钢); He Junhong (何俊宏); Li Peipei (李培培) (School of Computer & Information, Hefei University of Technology, Hefei 230009, China)
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal (北大核心), 2018, Issue 4, pp. 992-995, 1000 (5 pages)
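Funding: National Key Research and Development Program of China (2016YFC0801406); National Natural Science Foundation of China, Young Scientists Fund (61503112); National Natural Science Foundation of China (61673152)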
Keywords: classification; ensemble learning; class imbalance; label missing