Abstract
Fuel consumption prediction for aircraft suffers from class imbalance. The traditional SMOTE method handles this by randomly constructing pseudo-samples for the minority class, which changes the overall data distribution and blurs the boundaries between fuel-consumption intervals. To address these problems, an improved SMOTE algorithm based on k-medoids, named KMSMOTE, is proposed, with a random forest as the classifier for fuel consumption in the climb phase. The method first clusters the minority class with k-medoids and then applies SMOTE within the resulting clusters to construct pseudo-samples, so that the classification results are not biased toward the majority class; a random forest is then used to build the classifier. Data from multiple flights on the same domestic route and with the same aircraft type were selected as experimental samples. The experimental results show that the improved algorithm achieves better classification performance.
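The oversampling stage described in the abstract (cluster the minority class with k-medoids, then interpolate SMOTE-style only between samples of the same cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `k_medoids` and `km_smote`, the parameters `k_clusters` and `k_nn`, the uniform choice among clusters, and the toy data are all hypothetical. The random forest stage is omitted; the balanced data would then be passed to any random forest implementation.

```python
import math
import random


def k_medoids(points, k, n_iter=50, rng=None):
    """Alternating k-medoids; returns one cluster label per point."""
    rng = rng or random.Random(0)
    medoids = rng.sample(range(len(points)), k)  # indices of initial medoids

    def assign():
        # Attach every point to its nearest medoid (ties go to the lower label).
        return [min(range(k), key=lambda c: math.dist(p, points[medoids[c]]))
                for p in points]

    for _ in range(n_iter):
        labels = assign()
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # empty cluster: keep the previous medoid
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total distance
            # to the members of its own cluster.
            new_medoids.append(min(
                members,
                key=lambda i: sum(math.dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign()


def km_smote(minority, n_new, k_clusters=2, k_nn=3, seed=0):
    """Generate n_new synthetic minority samples, interpolating within clusters only."""
    rng = random.Random(seed)
    labels = k_medoids(minority, k_clusters, rng=rng)
    clusters = [[p for p, lab in zip(minority, labels) if lab == c]
                for c in range(k_clusters)]
    usable = [c for c in clusters if len(c) >= 2]  # a partner is needed to interpolate
    if not usable:
        raise ValueError("no cluster has two or more samples")
    synthetic = []
    for _ in range(n_new):
        cluster = rng.choice(usable)  # assumption: clusters are chosen uniformly
        base = rng.choice(cluster)
        others = [q for q in cluster if q is not base]
        # SMOTE step: pick one of the k_nn nearest neighbours inside the cluster.
        neighbours = sorted(others, key=lambda q: math.dist(base, q))[:k_nn]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to partner
        synthetic.append([b + gap * (p - b) for b, p in zip(base, partner)])
    return synthetic


# Toy minority class with two separated groups (made-up numbers, not flight data).
minority = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
            [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]]
synthetic = km_smote(minority, n_new=6, k_clusters=2)
```

Because each synthetic point lies on a segment between two samples of the same cluster, no pseudo-sample is placed in the empty region between clusters, which is the effect the abstract attributes to clustering before oversampling.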
Authors
Chen Jingjie (陈静杰)
Cui Jincheng (崔金成)
College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2019, Issue 4, pp. 247-250, 316 (5 pages in total)
Funding
National Key Technology R&D Program of China (国家科技支撑计划项目, 2012BAC20B0304)