
Text Topic Number Evaluation Based on Bayes Information Criteria (cited by: 5)
Abstract: Topic identification and keyword extraction in specific domains have wide application, but topic categories that are specified by hand or generated automatically by text clustering lack an objective measure. This paper combines model selection theory based on the Bayesian Information Criterion (BIC) with Independent Component Analysis (ICA) to estimate the number of topics probabilistically, giving the statistical distribution of the topic number in the BIC sense. On this basis, it performs an ICA decomposition of the term-document matrix and obtains the keywords of each topic, together with their weights, from the separated independent components. Experiments show that, without support from domain knowledge, the method can estimate a topic number that reflects the text collection and extract the corresponding keywords.
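The two steps named in the abstract, scoring candidate topic numbers with BIC and reading keywords off a separated component, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the standard Schwarz form BIC = -2 ln L + d ln N, a uniform prior over the topic number K, and that the log-likelihood and parameter count for each K are supplied by whatever ICA fit is used. The function names and all numbers below are hypothetical.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """Schwarz's criterion, -2 ln L + d ln N; lower values are better."""
    return -2.0 * log_likelihood + num_params * math.log(num_samples)


def topic_number_posterior(bic_by_k):
    """Map {K: BIC_K} to approximate P(K | data), proportional to exp(-BIC_K / 2).

    Assumes a uniform prior over K. The smallest BIC is subtracted first
    so that the exponentials do not underflow to zero for every K.
    """
    best = min(bic_by_k.values())
    weights = {k: math.exp(-(b - best) / 2.0) for k, b in bic_by_k.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def top_keywords(component_weights, vocabulary, n=3):
    """Return the n terms with the largest absolute weight in one component."""
    ranked = sorted(
        zip(vocabulary, component_weights),
        key=lambda term_weight: abs(term_weight[1]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical fits for K = 2, 3, 4 topics on 100 documents:
# (log-likelihood, number of free parameters). These are made-up values.
fits = {2: (-1050.0, 20), 3: (-1000.0, 30), 4: (-995.0, 40)}
scores = {k: bic(ll, d, 100) for k, (ll, d) in fits.items()}
posterior = topic_number_posterior(scores)
print(max(posterior, key=posterior.get), posterior)

# Hypothetical weights of four terms in one separated component.
print(top_keywords([0.1, -0.9, 0.4, 0.05], ["data", "topic", "model", "the"], n=2))
```

Normalizing exp(-BIC/2) yields a distribution over candidate topic numbers rather than a single point estimate, which is one way to read "the statistical distribution of the topic number in the BIC sense".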
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2009, No. 7, pp. 183-185 (3 pages)
Keywords: topic identification; keyword extraction; Independent Component Analysis (ICA); Bayesian Information Criterion (BIC)

