
Research on the Correlation of Speech Recognition Rate and Retrieval Performance (cited by: 2)
Abstract Content-based retrieval over large volumes of speech requires combining speech recognition with information retrieval techniques. This paper adjusts the language model to produce recognition transcripts at different recognition rates, then compares keyword retrieval results across those transcripts in order to study how speech recognition performance relates to retrieval performance. Experiments on 114 hours of speech data show that recognition performance is correlated to some degree with retrieval performance, and that improving the retrieval method can offset part of the error introduced by speech recognition. The results provide a basis for targeted improvements to the recognition engine, to the representation of recognition output, and to the corresponding fast retrieval methods.
Source Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2006, No. 3, pp. 99-104 (6 pages)
Funding National High Technology Research and Development Program of China ("863" Program), grant 2005AA114070
Keywords computer application; Chinese information processing; speech recognition; keyword retrieval; recall; precision
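
The abstract evaluates keyword retrieval on recognition transcripts using recall and precision. As a minimal sketch of how those two measures could be computed, and not the authors' actual evaluation procedure, the Python below compares keyword hits in a reference transcript with hits in a recognized transcript. The function names, the whitespace tokenization, and the toy documents are all illustrative assumptions; the paper's data and matching rules are not given in this record.

```python
def keyword_hits(transcripts, keyword):
    """Return the ids of documents whose transcript contains the keyword as a token.

    `transcripts` maps a document id to its text. Whitespace tokenization is an
    assumption made for this sketch only.
    """
    return {doc_id for doc_id, text in transcripts.items()
            if keyword in text.split()}


def recall_precision(retrieved, relevant):
    """Compute (recall, precision) for one keyword from two sets of document ids."""
    correct = len(retrieved & relevant)
    recall = correct / len(relevant) if relevant else 0.0
    precision = correct / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical toy data: reference (human) transcripts and recognizer output.
reference = {
    "d1": "speech recognition research",
    "d2": "keyword search methods",
    "d3": "speech retrieval",
}
recognized = {
    "d1": "speech recognition research",  # recognized correctly
    "d2": "keyword speech methods",       # false alarm: "search" became "speech"
    "d3": "peach retrieval",              # miss: "speech" became "peach"
}

relevant = keyword_hits(reference, "speech")    # documents that truly contain the keyword
retrieved = keyword_hits(recognized, "speech")  # documents found in recognizer output
recall, precision = recall_precision(retrieved, relevant)
print(recall, precision)
```

In this toy case one recognition error removes a true hit and another introduces a false one, so both measures drop. This is the kind of dependence between recognition errors and retrieval quality that the abstract describes.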