Abstract
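In recent years, as major spectroscopic sky surveys have been carried out, the volume of observed celestial spectra has grown rapidly, and large surveys place higher demands on automatic spectral classification and analysis. This paper recasts the classification problem as a regression problem and proposes a spectral class prediction method based on a deep residual network to predict the spectral subtype of stellar spectra. The network mainly consists of 25 convolutional layers, 1 max pooling layer, 1 average pooling layer, a fully connected layer, and 12 residual structures. The max pooling layer selects features, the convolutional layers extract features, and the average pooling layer reduces the number of model parameters and improves efficiency. The residual structures prevent network degradation, allow the network to be deepened to extract high-dimensional abstract features, and speed up training. Because the data have a non-zero probability of containing mislabeled and corrupted samples, Log-Cosh is used as the loss function to reduce the negative influence of bad samples. The experimental data are 80 000 spectra randomly drawn from LAMOST DR5; owing to spectral quality and other factors, the number of spectra differs between spectral types. After bad values are removed and the flux is normalized, the data are split 7∶1∶2 into training, validation, and test sets. The experiment has two parts. In the first, the network is trained on the dataset to predict spectral subtypes, and the maximum absolute error, mean absolute error, and standard deviation are used to compare convolution kernels of different shapes. With the predicted value as the abscissa and the label as the ordinate, a second-order nonlinear fit to all test-set samples yields a line that coincides with y=x, showing that the model predicts spectral subtypes well. In the second part, the model is analyzed internally: class activation mapping is used to examine the main features the model attends to when predicting A, F, G, and K type spectra, which gives the model interpretability. On the dataset used here, the prediction error is within 0.5 spectral subtype for 91.4% of spectra, and the mean absolute error is 0.3 spectral subtype. The method is also compared on the same dataset with nonparametric regression, Adaboost regression trees, and K-Means; the results show that the proposed method predicts spectral subtypes well, with higher speed and accuracy.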
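The abstract names Log-Cosh as the loss and reports the maximum absolute error, the mean absolute error, and the share of spectra predicted within 0.5 subtype. The following is a minimal sketch of those quantities, not the authors' implementation. It assumes that spectral subtypes have been encoded as real numbers (the encoding is not given in the abstract); the function names and the toy values are hypothetical.

```python
import math


def log_cosh(residual: float) -> float:
    """Return log(cosh(residual)) in a numerically stable form.

    Uses log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2),
    which avoids the overflow of cosh(x) for large |x|.
    """
    a = abs(residual)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_cosh_loss(predicted, labels):
    """Mean Log-Cosh loss over paired predictions and labels."""
    return sum(log_cosh(p - y) for p, y in zip(predicted, labels)) / len(labels)


def subtype_error_summary(predicted, labels, tolerance=0.5):
    """Summarize absolute prediction errors on numerically encoded subtypes.

    The numeric encoding of subtypes is an assumption of this sketch.
    """
    errors = [abs(p - y) for p, y in zip(predicted, labels)]
    n = len(errors)
    return {
        "max_abs_error": max(errors),
        "mean_abs_error": sum(errors) / n,
        "fraction_within_tolerance": sum(e <= tolerance for e in errors) / n,
    }


# Toy values for illustration only; the last sample mimics a mislabeled spectrum.
labels = [10.0, 12.0, 25.0, 31.0]
predicted = [10.2, 12.4, 25.1, 33.0]

summary = subtype_error_summary(predicted, labels)
loss = log_cosh_loss(predicted, labels)
```

For small residuals Log-Cosh behaves like half the squared error, and for large residuals it grows roughly like the absolute error minus log 2, so a single badly labeled spectrum contributes far less to the loss than it would under a squared-error loss.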
In recent years, the volume of observed celestial spectra has increased dramatically thanks to the successful implementation of various spectroscopic sky surveys. Large-scale surveys therefore place higher requirements on the automatic classification and analysis of spectra. In this paper, the classification problem is transformed into a regression problem, and a spectral category regression method based on a deep residual network is put forward to predict the MK spectral subtype of stellar spectra. The network is mainly composed of 25 convolution layers, 1 max pooling layer, 1 average pooling layer, a fully connected layer and 12 residual structures. The max pooling layer is used to select features, the convolution layers to extract features, and the average pooling layer to reduce parameters and improve efficiency. The residual structures can prevent degradation of the network, allow high-dimensional abstract features to be extracted by deepening the network, and improve training speed. Considering the non-zero probability of mislabeled and corrupted data, Log-Cosh is adopted as the loss function to reduce the negative impact of bad samples. 80 000 spectra randomly selected from LAMOST DR5 are used as the experimental data. After bad values are eliminated and the flux is normalized, the spectra are divided into training, validation and test sets in the proportion 7∶1∶2. The experiment includes two parts. In the first part, the network is used to predict the spectral subtype, and the maximum absolute error, the mean absolute error and the standard deviation are used to compare the performance of convolution kernels with different shapes. With the predicted value as the abscissa and the label as the ordinate, a second-order nonlinear fit to all sample points in the test set yields a straight line that coincides with y=x, proving that the model can predict the spec
Authors
王天翔
范玉峰
王晓丽
龙潜
王传军
WANG Tian-xiang; FAN Yu-feng; WANG Xiao-li; LONG Qian; WANG Chuan-jun (Yunnan Observatories, Chinese Academy of Sciences, Kunming 650011, China; University of Chinese Academy of Sciences, Beijing 100049, China)
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
PKU Core (Peking University core journal list)
2021, No. 5, pp. 1602-1606 (5 pages)
Spectroscopy and Spectral Analysis
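Funding
Supported by the National Natural Science Foundation of China (11603072, 11773074)
and the Yunnan Provincial Department of Science and Technology "Science and Technology into Yunnan" project (202003AD150003).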
Keywords
Stellar spectrum
Spectral subtype prediction
MK classification
Deep learning
Regression
Feature mapping