
Query-by-example Spoken Term Detection by Applying the HDPHMM Tokenizer (cited by: 1)
Abstract: This paper proposes an unsupervised query-by-example spoken term detection (QbE-STD) method based on a hierarchical Dirichlet process hidden Markov model (HDPHMM) tokenizer. The method first applies a hidden Markov model with two state layers, in which the top-layer states represent the discovered acoustic units and the bottom-layer states model the emission probabilities of the top-layer states. Placing a hierarchical Dirichlet process prior on the top-layer states yields the nonparametric Bayesian model HDPHMM. The model is trained on unlabeled speech data and then outputs posteriorgram feature vectors for the test utterances and the query examples. The posteriorgrams are refined with a non-negative matrix factorization algorithm to obtain new features, and detection is then performed on these features with a modified segmental dynamic time warping (SDTW) algorithm; together these steps form the QbE-STD system. Experimental results show that the proposed method outperforms a baseline system based on a Gaussian mixture model tokenizer and significantly improves detection precision.
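The last two stages named in the abstract (refining posteriorgrams with non-negative matrix factorization, then matching query against test frames with dynamic time warping) can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: posteriorgrams are stored as lists of per-frame probability vectors, the NMF step uses the standard Lee-Seung multiplicative updates with the per-frame activations taken as the new features, the frame distance is the negative log inner product, and the matcher is a plain subsequence DTW rather than the paper's modified segmental DTW. All function names are hypothetical, and HDPHMM training itself is not covered.

```python
import math
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative V (frames x units) by W times H using the
    Lee-Seung multiplicative updates for squared Euclidean error."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H


def frobenius_error(V, W, H):
    """Squared reconstruction error between V and W times H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))


def normalize_rows(M, eps=1e-12):
    """Rescale each row to sum to (approximately) one."""
    return [[x / (sum(row) + eps) for x in row] for row in M]


def frame_distance(p, q, eps=1e-12):
    """Negative log inner product; assumes p and q are probability vectors."""
    return -math.log(sum(a * b for a, b in zip(p, q)) + eps)


def subsequence_dtw(query, test):
    """Align the whole query to any contiguous region of test.
    Returns (cost divided by query length, index of last matched test frame)."""
    n, m = len(query), len(test)
    d = [[frame_distance(q, x) for x in test] for q in query]
    D = [[0.0] * m for _ in range(n)]
    D[0] = d[0][:]  # free start: the query may begin at any test frame
    for i in range(1, n):
        D[i][0] = D[i - 1][0] + d[i][0]
        for j in range(1, m):
            D[i][j] = d[i][j] + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    end = min(range(m), key=lambda j: D[n - 1][j])
    return D[n - 1][end] / n, end
```

Under these assumptions, one way to chain the steps is to stack the query and test posteriorgrams into one matrix, call `nmf`, pass the rows of `W` through `normalize_rows`, split them back into query and test parts, and hand both to `subsequence_dtw`. A lower returned cost indicates a more likely occurrence of the query ending at the reported frame.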
Source: Journal of Signal Processing (《信号处理》), CSCD, Peking University Core Journal, 2017, No. 5, pp. 703-710 (8 pages)
Funding: National Natural Science Foundation of China (61673395, 61403415, 61302107)
Keywords: unsupervised; query-by-example spoken term detection; hierarchical Dirichlet process; non-negative matrix factorization