Abstract
Since the beginning of the 21st century, social media and artificial intelligence technologies have developed rapidly, giving audio-visual intelligence a historic opportunity for a new round of upgrading while also confronting it with unprecedented challenges. In the post-pandemic era, the volume of video and audio data from discussion scenes, in which online and offline meetings complement each other, has grown rapidly. These meetings generally focus on key theoretical research or exchanges on cutting-edge technologies and therefore carry very high intelligence value, but traditional manual methods can no longer meet the need to discover audio-visual intelligence quickly. This paper proposes using computer vision to mine intelligence from discussion-scene videos along the image dimension. It surveys and analyzes discussion scenes and focuses on face recognition in discussion scenes, text detection and recognition in natural scenes, and the discovery of potential associations between people and text. In addition, an image intelligence mining and analysis workflow for discussion-scene videos was developed on a microservice framework to technically validate the above research, with good results.
Authors
WU Shuyi (吴叔義)
GUO Xiufeng (郭秀峰)
HOU Li (侯丽)
Center for Information Research, Academy of Military Sciences, Beijing 100142, China
Source
National Defense Technology (《国防科技》)
2022, Issue 4, pp. 131-136 (6 pages)
Keywords
audio-visual intelligence
video of discussion scenes
face recognition
text detection and recognition in natural scenes