
Deep Semantic Representation of Time-Sync Comments for Videos

Original Chinese title: 视频实时评论的深度语义表征方法

Cited by: 6
Abstract: With the advance of Internet technology, crowdsourced short texts such as time-sync comments on videos (also known as bullet-screen comments) have become increasingly popular and have had a significant impact on online media-sharing platforms and the entertainment industry. Research on such short texts offers new opportunities for recommender systems, artificial intelligence, and related fields, and is of great value across industries. Understanding and analyzing these video-oriented crowdsourced short texts also poses many challenges: their high noise, non-standard expressions, and implicit semantics severely limit traditional natural language processing (NLP) techniques, so a fault-tolerant understanding method that can capture the deep semantics of short texts is needed. To address these challenges, this paper proposes a deep semantic representation model based on a recurrent neural network (RNN), built on the assumption that time-sync comments posted within nearby time spans of a video have similar semantics. Because the model uses a character-level RNN, it avoids the effect of comment noise on word segmentation. Using a neural network also allows the resulting semantic vectors to express the implicit semantics of the comments. On this basis, the paper further designs a time-sync comment explanation framework based on semantic retrieval, which also serves as an application-level validation of the semantic representation. Finally, the proposed model is compared with several baseline methods under different evaluation measures. The experimental results show that the model captures the semantics of these short texts precisely and that the assumption about time-sync comments is reasonable.
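The two ideas in the abstract, encoding a comment character by character and then retrieving semantically close comments by vector similarity, can be illustrated with a minimal sketch. This is not the paper's model: the weights below are random and untrained, the plain Elman-style recurrence stands in for whatever RNN cell the authors use, and every name (`build_vocab`, `CharRNNEncoder`, `cosine`, `retrieve`) and the sample comments are hypothetical. It only shows why no word segmentation step is needed and how retrieval over the resulting vectors could work.

```python
import math
import random


def build_vocab(comments):
    """Map every distinct character to an integer id (no word segmentation)."""
    chars = sorted({ch for text in comments for ch in text})
    return {ch: i for i, ch in enumerate(chars)}


class CharRNNEncoder:
    """Tiny Elman-style RNN with random, untrained weights (illustration only)."""

    def __init__(self, vocab, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab
        self.hidden_size = hidden_size
        # One input weight row per character (acts like an embedding table).
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                     for _ in range(len(vocab))]
        # Recurrent weights: w_rec[k][j] connects old unit k to new unit j.
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]

    def encode(self, text):
        """Run the RNN over the characters and return the final hidden state."""
        h = [0.0] * self.hidden_size
        for ch in text:
            idx = self.vocab.get(ch)
            if idx is None:  # assumption: unseen characters are skipped
                continue
            x = self.w_in[idx]
            h = [math.tanh(x[j] + sum(self.w_rec[k][j] * h[k]
                                      for k in range(self.hidden_size)))
                 for j in range(self.hidden_size)]
        return h


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(encoder, query, corpus, top_k=2):
    """Rank corpus comments by cosine similarity to the query vector."""
    q = encoder.encode(query)
    scored = [(cosine(q, encoder.encode(c)), c) for c in corpus]
    return sorted(scored, reverse=True)[:top_k]


# Hypothetical noisy comments; digits and slang are handled as raw characters.
corpus = ["23333", "前方高能", "hahaha", "高能预警"]
encoder = CharRNNEncoder(build_vocab(corpus))
results = retrieve(encoder, "高能", corpus)
```

With untrained weights the ranking carries no real semantic meaning. In a trained model, the abstract's assumption suggests that comments posted close together in video time would be pushed toward nearby vectors, which is what would make the retrieval step useful for explaining an unfamiliar comment.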
Authors: Wu Famin (吴法民); Lü Guangyi (吕广奕); Liu Qi (刘淇); He Ming (何明); Chang Biao (常标); He Weidong (何伟栋); Zhong Hui (钟辉); Zhang Le (张乐). Affiliations: School of Software Engineering, University of Science and Technology of China, Hefei 230051; Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science, University of Science and Technology of China, Hefei 230027.
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2019, No. 2, pp. 293-305 (13 pages).
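Funding: National Key Research and Development Program of China (2016YFB1000904); National Natural Science Foundation of China (61672483, U1605251); Youth Innovation Promotion Association of the Chinese Academy of Sciences, member special fund (2014299).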
Keywords: time-sync comment for videos; bullet-screen; deep semantic representation; semantic retrieval; character-based recurrent neural network
