Abstract
Building on the Bayesian belief network, this paper proposes a new dynamic topic tracking model, which serves as the representation model for documents. Temporal information in dynamic topic tracking is quantified by time distance and used to adjust feature weights dynamically. Because the weight of a feature that has not reappeared for a long time should decay, a weight decay function is given; if a feature's weight falls below a given threshold after decay, the feature is treated as redundant information. Experiments use the TDT4 test collection and are evaluated with DET curves. Repeated experiments on the TDT corpus yield the optimal time distance threshold α and the threshold β that determines whether a feature is redundant. The results show that temporal weighting effectively improves the tracking performance of the dynamic topic tracking model.
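The abstract does not state the form of the weight decay function, so the following is only a minimal sketch of how thresholds α and β might interact. The exponential decay, the `rate` parameter, and the function name `decay_weights` are all assumptions made for illustration and are not taken from the paper.

```python
import math


def decay_weights(weights, last_seen, now, alpha, beta, rate=0.1):
    """Decay stale feature weights and drop redundant features.

    weights:   dict mapping feature -> current weight
    last_seen: dict mapping feature -> time the feature last appeared
    now:       current time, in the same units as last_seen
    alpha:     time distance threshold; decay applies only beyond it
    beta:      redundancy threshold; weights below it are discarded
    rate:      hypothetical decay rate (assumption, not from the paper)
    """
    updated = {}
    for feature, weight in weights.items():
        gap = now - last_seen[feature]  # time distance since last occurrence
        if gap > alpha:
            # Assumed exponential decay over the time beyond alpha.
            weight *= math.exp(-rate * (gap - alpha))
        if weight >= beta:
            updated[feature] = weight  # keep features that are still informative
    return updated


if __name__ == "__main__":
    weights = {"quake": 1.0, "rescue": 0.5, "old": 0.2}
    last_seen = {"quake": 10, "rescue": 8, "old": 0}
    print(decay_weights(weights, last_seen, now=10, alpha=3, beta=0.15))
```

In this sketch, features seen within the last α time units keep their weight unchanged, while older ones shrink and are eventually removed once they fall below β.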
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2015, Issue 2, pp. 233-236, 240 (5 pages)
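Funding
China Postdoctoral Science Foundation (20070420700)
Natural Science Foundation of Hebei Province (F2011201146)
Hebei Province Science and Technology Plan Project (13450337)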
Keywords
Topic tracking
Temporal weight
Decay
Bayesian belief network