Abstract
Principal component analysis (PCA) is applied to audio signal reconstruction in a laser microphone based on high-speed vision, which extracts speech information from the dynamic changes of laser speckle on the surface of lightweight elastic objects in a sound field. Each frame of the high-speed speckle video is treated as a vector in a high-dimensional space, the frames are stacked in temporal order into a data matrix, and PCA is used for feature extraction. The speech information resides in the principal components with large variance, and the first principal component alone is usually sufficient to reconstruct a clear speech signal. Experiments show that PCA places few restrictions on the grain size and gray-level distribution of the laser speckle: even with a small sampling region and reflecting objects of different materials, it reconstructs speech signals intelligible to the human ear. Moreover, because PCA is an unsupervised machine learning algorithm, frames from the beginning of the video can serve as a training set from which the eigenvectors of the principal components carrying audio information are extracted; these eigenvectors then act as a projection basis for the subsequent video frame vectors, enabling fast extraction of the speech signal.
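The pipeline described in the abstract (flatten frames into vectors, stack them into a data matrix, take the first principal component, then reuse its eigenvector as a projection basis for later frames) can be sketched roughly as follows. This is a minimal illustration using NumPy, not the authors' implementation: the function names, the frame rate, the 8x8 frame size, and the synthetic "speckle" data (a fixed random spatial pattern modulated by a 100 Hz tone plus noise) are all assumptions chosen only to make the example self-contained.

```python
import numpy as np


def frames_to_matrix(frames):
    """Flatten a (T, H, W) frame stack into a (T, H*W) data matrix."""
    frames = np.asarray(frames, dtype=np.float64)
    return frames.reshape(frames.shape[0], -1)


def fit_first_component(X_train):
    """Return the training mean and the unit eigenvector of the first PC."""
    mean = X_train.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions,
    # ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[0]


def project(X, mean, basis):
    """Project frame vectors onto the basis: one sample per frame."""
    return (X - mean) @ basis


# Synthetic stand-in for a speckle video (all values are hypothetical).
rng = np.random.default_rng(0)
fs = 2000                                  # assumed camera frame rate, Hz
t = np.arange(400) / fs
tone = np.sin(2 * np.pi * 100 * t)         # assumed "audio" signal
pattern = rng.normal(size=(8, 8))          # assumed spatial response
frames = tone[:, None, None] * pattern + 0.05 * rng.normal(size=(400, 8, 8))

X = frames_to_matrix(frames)
# Train on the first 100 frames only, then project every frame.
mean, basis = fit_first_component(X[:100])
signal = project(X, mean, basis)
```

Because the sign of a principal direction is arbitrary, the recovered `signal` may be an inverted copy of the source, which does not matter for audio playback. Projecting later frames is a single matrix-vector product per frame, which is why reusing a basis learned from the initial frames is fast.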
Authors
孙学明
张大华
周志全
赵张美
胡荣磊
SUN Xue-ming; ZHANG Da-hua; ZHOU Zhi-quan; ZHAO Zhang-mei; HU Rong-lei (Beijing Electronic Science and Technology Institute, Beijing 100070, China)
Source
《激光与红外》
CAS
CSCD
Peking University Core Journals (北大核心)
2022, No. 12, pp. 1761-1767 (7 pages)
Laser & Infrared
Keywords
laser microphone
laser speckle
sound extraction
principal component analysis
machine learning