
Research on Star/Galaxy Classification Based on XGBoost Algorithm (基于XGBoost算法的恒星/星系分类研究)
Abstract: Machine learning has achieved great success in many fields. Boosting algorithms in particular adapt well to varied scenarios, reach high accuracy, and already play a major role in many areas, yet their application in astronomy remains rare. To address the low classification accuracy for faint sources in star/galaxy classification of Sloan Digital Sky Survey (SDSS) data, this work introduces XGBoost (eXtreme Gradient Boosting), a relatively recent machine learning method. The complete photometric data set is obtained from SDSS Data Release 7 (SDSS-DR7) and divided by magnitude into a bright source set and a faint source set. First, ten-fold cross-validation is applied to the bright and faint source sets separately, and XGBoost is used to build the star/galaxy classification model. Then, grid search and other methods are used to tune the XGBoost parameters. Finally, based on galaxy classification accuracy and other metrics, the results are compared with those of Function Tree (FT), Adaptive Boosting (AdaBoost), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), Stacked Denoising AutoEncoders (SDAE), and Deep Belief Network (DBN) models. The experimental results show that on the faint source set XGBoost improves galaxy classification accuracy by nearly 10% over the function tree algorithm, and by nearly 5% over the function tree algorithm at the faintest magnitudes of that set. XGBoost also improves, to varying degrees, on the other traditional machine learning algorithms and the deep neural networks.
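The evaluation protocol described in the abstract (split by magnitude, ten-fold cross-validation, grid search, scoring by galaxy classification accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the classifier is a one-feature threshold rule standing in for XGBoost so that the block runs with only the Python standard library, the data are synthetic, and the names `FAINT_CUT`, `fit_stump`, and `make_synthetic` as well as the cutoff value 19.0 are hypothetical and do not come from the paper.

```python
import random

FAINT_CUT = 19.0  # hypothetical magnitude cutoff; the abstract gives no value


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def galaxy_accuracy(y_true, y_pred):
    """Fraction of true galaxies (label 1) that are predicted as galaxies."""
    hits = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(hits) / len(hits) if hits else 0.0


def fit_stump(xs, ys, n_thresholds):
    """Stand-in model (NOT XGBoost): pick the threshold with the best
    training accuracy; an object is called a galaxy when x > threshold."""
    lo, hi = min(xs), max(xs)
    best_t, best_acc = lo, -1.0
    for j in range(n_thresholds):
        t = lo + (hi - lo) * (j + 0.5) / n_thresholds
        acc = sum((x > t) == (y == 1) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def cross_validate(xs, ys, n_thresholds, k=10):
    """Mean galaxy accuracy over k folds for one hyperparameter value."""
    folds = k_fold_indices(len(xs), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        t = fit_stump([xs[j] for j in train_idx],
                      [ys[j] for j in train_idx], n_thresholds)
        preds = [int(xs[j] > t) for j in test_idx]
        scores.append(galaxy_accuracy([ys[j] for j in test_idx], preds))
    return sum(scores) / k


def grid_search(xs, ys, grid, k=10):
    """Evaluate every grid value with k-fold CV and return the best one."""
    results = {p: cross_validate(xs, ys, p, k) for p in grid}
    best = max(results, key=results.get)
    return best, results


def make_synthetic(n=400, seed=1):
    """Synthetic stand-in for photometric data: one feature, label, magnitude."""
    rng = random.Random(seed)
    ys = [rng.randint(0, 1) for _ in range(n)]           # 1 = galaxy, 0 = star
    xs = [rng.gauss(1.0 if y else 0.0, 0.5) for y in ys]
    mags = [rng.uniform(14.0, 21.0) for _ in range(n)]
    return xs, ys, mags


if __name__ == "__main__":
    xs, ys, mags = make_synthetic()
    for name, keep in (("bright", lambda m: m < FAINT_CUT),
                       ("faint", lambda m: m >= FAINT_CUT)):
        sel = [i for i, m in enumerate(mags) if keep(m)]
        best, results = grid_search([xs[i] for i in sel],
                                    [ys[i] for i in sel], [2, 8, 32])
        print(name, "best n_thresholds:", best, "CV galaxy accuracy:",
              round(results[best], 3))
```

To move toward the setup the abstract describes, `fit_stump` and the prediction line in `cross_validate` would be replaced by training and prediction calls of an XGBoost classifier, and the grid would range over its tuning parameters; which parameters the authors tuned is not stated in the abstract.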
Authors: LI Chao (李超); ZHANG Wen-hui (张文辉); LIN Ji-ming (林基明). Affiliations: College of Information and Communication Engineering, Guilin University of Electronic Technology, Guilin 541004; Key Laboratory of Cognitive Radio and Information Processing, the Ministry of Education, Guilin University of Electronic Technology, Guilin 541004; Guangxi Cooperative Innovation Center of Cloud Computing and Big Data, Guilin University of Electronic Technology, Guilin 541004; Guangxi Colleges and Universities Key Laboratory of Cloud Computing and Complex Systems, Guilin University of Electronic Technology, Guilin 541004
Source: Acta Astronomica Sinica (《天文学报》), indexed in CSCD and the Peking University Core Journals list, 2019, No. 2, pp. 73-82 (10 pages)
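Funding: Supported by the Guangxi Cooperative Innovation Center of Cloud Computing and Big Data and by a project of the Guangxi Colleges and Universities Key Laboratory of Cloud Computing and Complex Systems (No. 1716)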
Keywords: stars: fundamental parameters; galaxies: fundamental parameters; techniques: photometric; methods: data analysis