Abstract
Terahertz time-domain spectroscopy (THz-TDS) was used to test seed samples of Astragalus forage grasses. Terahertz time-domain spectra of the seeds of five common Astragalus adsurgens Pall. forage varieties were obtained in the effective frequency range of 0.2-1.2 THz, and optical parameters such as the absorption coefficient and refractive index of each seed sample were then derived by fast Fourier transform. Within the effective frequency range, the peak intensity and delay time of the time-domain spectra differed between samples, the average absorption coefficient and standard deviation of each spectral line differed markedly, and the average refractive index also varied considerably from sample to sample. In addition, this study proposes a hybrid model, PCA-RF, which combines principal component analysis (PCA) with the random forest (RF) machine learning algorithm to optimize the experimental data. Based on the terahertz refractive index spectra, 200 datasets of the five forage seed types were analyzed with both the PCA-RF model and the RF model. The average classification accuracy of the hybrid PCA-RF model was 91.20%. The PCA-RF model outperformed the RF model both in overall average classification accuracy and in the classification accuracy for each sample type. These results indicate that THz-TDS combined with the hybrid PCA-RF model is an effective method for the nondestructive identification of the authenticity of forage grass seeds, and that it can be used to distinguish forage grass varieties of the same family that differ only slightly.
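The comparison between the RF and PCA-RF classifiers described in the abstract can be sketched as follows. This is only an illustrative sketch using scikit-learn, not the authors' implementation: the spectra are synthetic stand-ins generated with hypothetical class offsets and noise, the split of 200 spectra into 40 per class is an assumption, and the number of principal components, the number of trees, and the 5-fold cross-validation are illustrative choices that the abstract does not specify.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for refractive index spectra (NOT the paper's data).
# Assumption: 5 seed types x 40 spectra each = 200 spectra in total.
n_classes, n_per_class = 5, 40
freqs = np.linspace(0.2, 1.2, 100)  # THz, the effective range in the abstract

X_parts, y_parts = [], []
for label in range(n_classes):
    # Hypothetical class-dependent refractive index curve plus noise.
    base = 1.5 + 0.02 * label - 0.03 * freqs
    spectra = base + rng.normal(0.0, 0.01, size=(n_per_class, freqs.size))
    X_parts.append(spectra)
    y_parts.append(np.full(n_per_class, label))

X = np.vstack(X_parts)
y = np.concatenate(y_parts)

# Plain random forest on the raw spectra.
rf = RandomForestClassifier(n_estimators=100, random_state=0)

# Hybrid model: PCA reduces the spectra before the random forest sees them.
# n_components=5 is an illustrative value, not taken from the paper.
pca_rf = make_pipeline(
    PCA(n_components=5),
    RandomForestClassifier(n_estimators=100, random_state=0),
)

# Stratified 5-fold cross-validated accuracy for both models.
rf_scores = cross_val_score(rf, X, y, cv=5)
pca_rf_scores = cross_val_score(pca_rf, X, y, cv=5)

print("RF mean accuracy:    ", rf_scores.mean())
print("PCA-RF mean accuracy:", pca_rf_scores.mean())
```

Placing PCA inside the pipeline means the projection is refitted on each training fold, so the held-out spectra do not influence the components used to classify them.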
Authors
Wang Fang (王芳)
Zhang Chunhong (张春红)
Zhao Jingfeng (赵景峰)
Sibateer Ha (哈斯巴特尔)
Zhang Yu (张玉)
Affiliations: College of Science, China University of Petroleum, Beijing 102249, China; Inner Mongolia Grassland Station, Huhhot, Inner Mongolia 010020, China
Source
Laser & Optoelectronics Progress (《激光与光电子学进展》)
CSCD
PKU Core Journal (北大核心)
2021, Issue 3, pp. 310-316 (7 pages)
Funding
Science and Technology Program of Inner Mongolia Autonomous Region (201602054).
Keywords
spectroscopy
terahertz time-domain spectroscopy
principal component analysis
random forest
qualitative identification
quantitative analysis