Abstract
A novel scheme for automatic news story segmentation based on audio-visual features and text information is presented. Shot boundaries in the news video are detected first; a text detection algorithm then identifies topic-caption frames, which provide segmentation cues. Next, silence clips are detected using short-time energy and short-time average zero-crossing rate (ZCR). Finally, the audio-visual features and text information are integrated to segment stories automatically. On test data of 135,400 frames, the method achieves an accuracy rate of 85.8% and a recall rate of 97.5%. The experimental results indicate that the approach is effective and robust.
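The silence-detection step can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, hop size, both thresholds, the function names, and the decision rule (a frame counts as silent when both its short-time energy and its ZCR fall below a threshold) are assumptions chosen for illustration only.

```python
def short_time_features(samples, frame_len=256, hop=128):
    """Return a list of (energy, zcr) pairs, one per frame.

    samples: sequence of floats (mono audio, assumed roughly in [-1, 1]).
    frame_len and hop are illustrative values, not taken from the paper.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean squared amplitude over the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Short-time average zero-crossing rate: fraction of adjacent
        # sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        feats.append((energy, zcr))
    return feats


def silent_frames(samples, energy_thresh=1e-4, zcr_thresh=0.1,
                  frame_len=256, hop=128):
    """Flag each frame as silent (True) or not (False).

    Assumed rule: silent when energy and ZCR are both below their
    thresholds. The threshold values are hypothetical.
    """
    return [
        energy < energy_thresh and zcr < zcr_thresh
        for energy, zcr in short_time_features(samples, frame_len, hop)
    ]
```

Runs of consecutive silent frames could then be merged into silence clips and compared against shot boundaries and topic-caption frames; in practice the thresholds would need tuning on the actual news audio.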
Source
Computer Engineering (《计算机工程》)
Indexed in: EI, CAS, CSCD, PKU Core
2005, No. 6, pp. 171-172, 199 (3 pages)
Keywords
News video
Story segmentation
Audio-visual features analysis
Text detection