Abstract
Random forest (RF) is one of the most classic machine learning algorithms and has been widely applied. However, although two-view data are abundant in practice and have been studied extensively, few RF methods have been built for the two-view setting. The only existing RF approach to two-view learning first generates a separate RF for each view and fuses cross-view information only at decision time. A notable shortcoming of this approach is that the correlation between the two views is not exploited during the RF construction stage, which wastes information. To remedy this shortcoming, an improved two-view random forest (ITVRF) is proposed. Specifically, canonical correlation analysis (CCA) is used for view fusion during decision-tree generation, embedding cross-view information interaction into the tree-construction stage and thereby exploiting the complementary information between views throughout the entire RF generation process. In addition, ITVRF generates discriminant decision boundaries for the decision trees via discriminant analysis, making it better suited to classification. Experimental results show that ITVRF achieves higher accuracy than the existing two-view RF (TVRF).
Authors
XIA Xiaoqiu
CHEN Songcan
XIA Xiaoqiu; CHEN Songcan (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
Source
《计算机科学与探索》
CSCD
Peking University Core Journals
2022, No. 1, pp. 144-152 (9 pages)
Journal of Frontiers of Computer Science and Technology
Funding
National Natural Science Foundation of China (61672281, 61732006).