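Abstract: As remote monitoring and artificial intelligence converge, traditional rate optimization algorithms are no longer suited to current mobile surveillance network scenarios. In machine vision applications, traditional rate optimization algorithms focus only on video quality, whereas machines care more about the semantic information the video conveys. Taking remote intelligent detection with 5G roadside cameras as the application scenario, this work proposes a rate optimization algorithm based on video semantics that maximizes object detection accuracy within a limited transmission bitrate range. Specifically, the algorithm introduces a video semantic task model and treats object detection as the semantic task. It analyzes the characteristic relationship between target bits and semantics, and builds a new weight that combines complexity with motion regions to allocate target bits so that object detection accuracy is maximized. Experimental results show that, compared with the frame-level Coding Tree Unit (CTU) layer rate control algorithm used in HM16.23, the proposed algorithm not only saves bitrate but also better matches the object detection needs of wireless remote monitoring. In the test environment, it improves object detection accuracy by 1.4% on average and by up to 2.5%.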
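As a rough illustration of weight-based bit allocation of the kind this abstract describes, the sketch below splits a frame's bit budget across CTUs in proportion to a combined weight. The abstract does not give the weight formula, so the form used here (complexity scaled up by a factor `alpha` inside motion regions), the function name `allocate_ctu_bits`, and all parameter names are assumptions made for illustration only, not the authors' method.

```python
def allocate_ctu_bits(frame_budget, complexity, in_motion, alpha=1.0):
    """Split a frame's bit budget across CTUs in proportion to a weight.

    Assumed weight (not taken from the paper):
        weight_i = complexity_i * (1 + alpha) if CTU i is in a motion region
        weight_i = complexity_i               otherwise
    """
    if len(complexity) != len(in_motion):
        raise ValueError("complexity and in_motion must have the same length")
    if not complexity:
        return []

    # Boost the weight of CTUs that overlap a motion region.
    weights = [
        c * (1.0 + alpha) if moving else c
        for c, moving in zip(complexity, in_motion)
    ]
    total = sum(weights)
    if total <= 0:
        # Degenerate case: no usable weights, so split the budget evenly.
        return [frame_budget / len(weights)] * len(weights)

    # Proportional allocation: the shares always sum to the frame budget.
    return [frame_budget * w / total for w in weights]


# Three CTUs: the second is in a motion region, the third is more complex.
bits = allocate_ctu_bits(1200.0, [1.0, 1.0, 2.0], [False, True, False])
```

With `alpha=1.0`, a CTU in a motion region is weighted like a static CTU of twice its complexity, so detection-relevant areas receive a larger share of the same budget.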
Abstract: Multimedia document annotation is used in traditional multimedia database systems. However, without human help, it is very difficult to extract the semantic content of multimedia automatically. On the other hand, manually annotating the multimedia documents in a large database one by one is tedious. This paper first introduces a method to construct a semantic network on top of a multimedia database. Second, a useful and efficient annotation strategy based on this framework is presented to obtain accurate and rapid annotation of any multimedia database. Third, two methods of joint similarity measures for semantic and low-level features are evaluated.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 60933004, 60903141, 60903079, 60775030 and 60775035), the National Basic Research Program of China (No. 2007CB311004), the National High Technology Research and Development Program of China (No. 2007AA01Z132), and the National Science and Technology Pillar Program (No. 2006BAC08B06).
Abstract: Video event detection is an important research area, and modeling the video event is a key problem within it. In this paper, we combine dynamic description logic with linear time temporal logic to build a logic system for video event detection. The proposed logic system, named $LTD_{ALCO}$, can represent and reason over static, dynamic and temporal knowledge in one uniform logic system. Based on $LTD_{ALCO}$, a framework for video event detection is proposed. The framework can automatically obtain the logic description of video content with the help of ontology-based computer vision techniques, and it detects the specified video event by satisfiability checking on $LTD_{ALCO}$ formulas.
Funding: Supported by the Natural Science Foundation of Shaanxi Province (SJ08F15), the Industry Tackling Project of Shaanxi Province (2010K06-20), and the National Natural Science Foundation of China and Civil Aviation Administration of China (No. 61072110).
Abstract: A new video watermarking method for the Audio Video coding Standard (AVS) is proposed. Following human visual masking properties, the method determines the region of interest for watermark embedding by analyzing video semantics, generates a dynamic robust watermark according to video motion semantics, and embeds the watermark in the Intermediate Frequency (IF) Discrete Cosine Transform (DCT) coefficients of the luminance sub-block prediction residual in the region of interest. The method controls watermark embedding strength adaptively according to video texture semantics. Experiments show that the method is robust not only to various conventional attacks, but also to re-framing, frame cropping, frame deletion and other video-specific attacks.