Despite significant successes achieved in knowledge discovery,traditional machine learning methods may fail to obtain satisfactory performances when dealing with complex data,such as imbalanced,high-dimensional,noisy ...Despite significant successes achieved in knowledge discovery,traditional machine learning methods may fail to obtain satisfactory performances when dealing with complex data,such as imbalanced,high-dimensional,noisy data,etc.The reason behind is that it is difficult for these methods to capture multiple characteristics and underlying structure of data.In this context,it becomes an important topic in the data mining field that how to effectively construct an efficient knowledge discovery and mining model.Ensemble learning,as one research hot spot,aims to integrate data fusion,data modeling,and data mining into a unified framework.Specifically,ensemble learning firstly extracts a set of features with a variety of transformations.Based on these learned features,multiple learning algorithms are utilized to produce weak predictive results.Finally,ensemble learning fuses the informative knowledge from the above results obtained to achieve knowledge discovery and better predictive performance via voting schemes in an adaptive way.In this paper,we review the research progress of the mainstream approaches of ensemble learning and classify them based on different characteristics.In addition,we present challenges and possible research directions for each mainstream approach of ensemble learning,and we also give an extra introduction for the combination of ensemble learning with other machine learning hot spots such as deep learning,reinforcement learning,etc.展开更多
重叠社区发现是近年来复杂网络领域的研究热点之一.提出一种半监督的局部扩展式重叠社区发现方法SLEM(semi-supervised local expansion method).该方法借鉴了带约束的半监督聚类的思想,不仅利用网络的拓扑结构信息,还充分地利用网络节...重叠社区发现是近年来复杂网络领域的研究热点之一.提出一种半监督的局部扩展式重叠社区发现方法SLEM(semi-supervised local expansion method).该方法借鉴了带约束的半监督聚类的思想,不仅利用网络的拓扑结构信息,还充分地利用网络节点的属性信息.首先将网络节点的属性信息转化为成对约束,并根据成对约束修正网络的拓扑结构,使网络中的社区结构更加明显;然后基于网络节点的度中心性选取种子节点,得到分散的、局部节点度大的种子作为初始社区;再采用贪心策略将初始社区向邻居节点扩展,得到局部连接紧密的社区;最后检测并合并冗余社区,得到高覆盖率的社区发现结果.在模拟网络数据和真实网络数据上与当前有代表性的基于局部扩展的重叠社区发现算法进行了对比实验,结果表明SLEM方法在稀疏程度不同的网络上均能发现较高质量的重叠社区结构.展开更多
基金Supported by the National Natural Science Foundation of China under Grant No.60875031(国家自然科学基金)the National Basic Research Program of China under Grant No.2007CB311002(国家重点基础研究发展计划(973))+2 种基金the Program for New Century Excellent Talents in University of china under Grant No.NECT-06-0078(新世纪优秀人才支持计划)the Research Fund for the Doctoral Program of Higher Education of the Ministry of Education of China under Grant No.20050004008(教育部高等学校博士学科点专项科研基金)the Fok Ying-Tbng Education Foundation for Young Teachers in the Higher Education Instirutions of China under Grant No.101068(霍英东教育基金会高等院校青年教师基金)
基金Supported by the National Natural Science Foundation of China under Grant Nos.60372050 60372045 (国家自然科学基金)the National Basic Research Program of China under Grant No.2001CB309403 (国家重点基础研究发展计划(973))
基金the National Natural Science Foundation of China(Grant Nos.61722205,61751205,61572199,61502174,61872148,and U 1611461)the grant from the key research and development program of Guangdong province of China(2018B010107002)+1 种基金the grants from Science and Technology Planning Project of Guangdong Province,China(2016A050503015,2017A030313355)the grant from the Guangzhou science and technology planning project(201704030051).
文摘Despite significant successes achieved in knowledge discovery,traditional machine learning methods may fail to obtain satisfactory performances when dealing with complex data,such as imbalanced,high-dimensional,noisy data,etc.The reason behind is that it is difficult for these methods to capture multiple characteristics and underlying structure of data.In this context,it becomes an important topic in the data mining field that how to effectively construct an efficient knowledge discovery and mining model.Ensemble learning,as one research hot spot,aims to integrate data fusion,data modeling,and data mining into a unified framework.Specifically,ensemble learning firstly extracts a set of features with a variety of transformations.Based on these learned features,multiple learning algorithms are utilized to produce weak predictive results.Finally,ensemble learning fuses the informative knowledge from the above results obtained to achieve knowledge discovery and better predictive performance via voting schemes in an adaptive way.In this paper,we review the research progress of the mainstream approaches of ensemble learning and classify them based on different characteristics.In addition,we present challenges and possible research directions for each mainstream approach of ensemble learning,and we also give an extra introduction for the combination of ensemble learning with other machine learning hot spots such as deep learning,reinforcement learning,etc.
基金Supported by the National Natural Science Foundation of China under Grant Nos.60702033 60772076 (国家自然科学基金)+3 种基金the National High-Tech Research and Development Plan of China under Grant No.2007AA01Z171 (国家高技术研究发展计划(863)the Science Fund for Distinguished Young Scholars of Heilongjiang Province of China under Grant No.JC200611 (黑龙江省杰出青年科学基金)the Natural Science Foundation of Heilongjiang Province of China under Grant No.ZJG0705 (黑龙江省自然科学重点基金)the Foundation of Harbin Institute of Technology of China under Grant No.HIT.2003.53 (哈尔滨工业大学校基金)
基金Supported by the National Natural Science Foundation of China under Grant Nos.6049632160773099+3 种基金60573073(国家自然科学基金)the National High-Tech Research and Development Plan of China under Grant Nos.2006AA10Z244 2006AA10A309(国家高技术研究发展计划(863))the Science and Technology Development Plan of Jilin Province of China under Grant No.20030523(吉林省科技发展计划)the European Commission under Grant No.TH/Asia Link/010(111084)(欧盟项目)
文摘重叠社区发现是近年来复杂网络领域的研究热点之一.提出一种半监督的局部扩展式重叠社区发现方法SLEM(semi-supervised local expansion method).该方法借鉴了带约束的半监督聚类的思想,不仅利用网络的拓扑结构信息,还充分地利用网络节点的属性信息.首先将网络节点的属性信息转化为成对约束,并根据成对约束修正网络的拓扑结构,使网络中的社区结构更加明显;然后基于网络节点的度中心性选取种子节点,得到分散的、局部节点度大的种子作为初始社区;再采用贪心策略将初始社区向邻居节点扩展,得到局部连接紧密的社区;最后检测并合并冗余社区,得到高覆盖率的社区发现结果.在模拟网络数据和真实网络数据上与当前有代表性的基于局部扩展的重叠社区发现算法进行了对比实验,结果表明SLEM方法在稀疏程度不同的网络上均能发现较高质量的重叠社区结构.