Abstract
The Fourier transform infrared (FTIR) absorption peaks of alkane gases overlap severely in the mid-infrared region. To address this, a wavelength selection method that combines the impact value of variables with population analysis (IVPA) is proposed and applied to variable selection on the infrared spectra of five alkane gases: methane, ethane, propane, iso-butane and n-butane. The method screens variables iteratively. In each iteration, the variables are divided into a sample space and a variable space. In the sample space, the impact value of each variable is calculated, and weighted bootstrap sampling based on these impact values divides the variables into elite variables and normal variables. Meanwhile, in the variable space, the frequency with which each variable appears in the optimal model is counted. Finally, an exponential decay function is used to eliminate the normal variables with lower frequencies, and the root mean square error (RMSE) obtained in each iteration is recorded. The variable subset corresponding to the minimum RMSE is taken as the final selection. The method is tested on a measured alkane spectral dataset, and the results are compared with two recently proposed variable selection methods: stability competitive adaptive reweighted sampling (SCARS) and iteratively variable subset optimization (IVSO). Taking the iso-butane analysis as an example, the minimum cross-sensitivities of SCARS, IVSO and IVPA to the other four gases were 0.67%, 0.56% and 0.11%, respectively; the maximum cross-sensitivities were 1.69%, 1.49% and 1.02%; the relative errors of iso-butane prediction were 1.94%, 1.65% and 0.51%; and the numbers of selected variables were 52, 17 and 13. The results show that IVPA selected the fewest variables, only 0.36% of the original spectral data, gave the lowest cross-sensitivity to the other four gases, and predicted iso-butane most accurately. The method can be applied to spectra with overlapping absorption, where it can improve the prediction accuracy and running efficiency of the analysis model.
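The abstract names three building blocks (weighted bootstrap sampling, frequency counting, and exponential decay) without giving their formulas. The following is only a minimal sketch of how such steps might look in NumPy, not the authors' implementation. The function names, the geometric form of the decay schedule, and the use of non-negative impact values directly as sampling weights are all assumptions made for illustration.

```python
import numpy as np


def decay_schedule(n_start, n_end, n_iter):
    """Assumed exponential decay: variables to keep at each iteration.

    Goes from n_start at iteration 0 to n_end at the last iteration.
    Requires n_iter >= 2 and n_start >= n_end > 0.
    """
    steps = np.arange(n_iter)
    rate = np.log(n_start / n_end) / (n_iter - 1)
    return np.round(n_start * np.exp(-rate * steps)).astype(int)


def weighted_bootstrap_split(impact, n_draws, rng):
    """Split variable indices into elite (drawn at least once) and normal.

    Indices are drawn with replacement, with probability proportional
    to the (assumed non-negative) impact value of each variable.
    """
    prob = impact / impact.sum()
    drawn = rng.choice(impact.size, size=n_draws, replace=True, p=prob)
    elite = np.unique(drawn)
    normal = np.setdiff1d(np.arange(impact.size), elite)
    return elite, normal


def prune_normal(normal, frequency, n_remove):
    """Drop the n_remove normal variables with the lowest frequency."""
    order = normal[np.argsort(frequency[normal], kind="stable")]
    return order[n_remove:]


# Toy usage with made-up numbers (not data from the paper).
rng = np.random.default_rng(0)
impact = np.array([5.0, 0.0, 3.0, 0.0, 2.0, 0.0])
elite, normal = weighted_bootstrap_split(impact, n_draws=50, rng=rng)
schedule = decay_schedule(n_start=100, n_end=10, n_iter=5)
```

In a full loop, each iteration would fit a calibration model (partial least squares, according to the keywords) on the retained variables, record its RMSE, and finally return the subset with the smallest RMSE.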
Authors
张峰
汤晓君
仝昂鑫
王斌
汤春瑞
王杰
ZHANG Feng; TANG Xiao-jun; TONG Ang-xin; WANG Bin; TANG Chun-rui; WANG Jie (State Key Laboratory of Electrical Insulation and Power Equipment, Xi’an Jiaotong University, Xi’an 710049, China; CCTEG Chongqing Engineering (Group) Co., Ltd., Chongqing 400042, China)
Source
《光谱学与光谱分析》
SCIE
EI
CAS
CSCD
PKU Core (北大核心)
2021, No. 6, pp. 1795-1799 (5 pages)
Spectroscopy and Spectral Analysis
Funding
Supported by the National Key Research and Development Program of China (2016YFF0102805).
Keywords
Variable selection
Impact value of variable
Weighted bootstrap sampling
Fourier transform infrared spectrum
Partial least squares