Abstract
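[Objective] To address automatic summarization of user-generated content in academic Q&A communities, this paper proposes an improved automatic summary aggregation method that provides academic users in the community with efficient and accurate knowledge aggregation services. [Methods] An improved automatic summarization algorithm, W2V-MMR, is proposed. During word and sentence scoring and similarity calculation, it uses the deep-learning-based Word2Vec word vector model to improve the information quality of summary sentences, and it introduces the idea of Maximal Marginal Relevance (MMR) to automatically summarize user-generated question-and-answer text in academic Q&A communities. [Results] On four groups of experimental data, the proposed method obtained information quality scores of 1.4228, 1.4476, 1.5921, and 3.4168, all higher than those of the MMR and TextRank summarization methods used for comparison. [Limitations] The influence of the number of summary sentences on the results was not considered, and summary quality under different numbers of summary sentences was not compared. [Conclusions] The proposed method can be applied effectively to knowledge aggregation services in academic Q&A communities and offers academic users a new way to acquire knowledge quickly.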
[Objective] To address the knowledge aggregation problem of user-generated content (UGC) in academic Q&A communities, an improved automatic summarization method is proposed to provide efficient and accurate knowledge aggregation services for research users in the community. [Methods] The proposed method, W2V-MMR, combines the idea of Maximal Marginal Relevance (MMR) with the Word2Vec model. First, the information quality of summary sentences is optimized through Word2Vec during score and similarity calculation. Then MMR is introduced to extract summaries of UGC in the academic Q&A community. [Results] The information quality scores obtained by the proposed method on the four groups of experimental data are 1.4228, 1.4476, 1.5921, and 3.4168, all higher than those of the MMR and TextRank methods in the comparative experiment. [Limitations] The effect of the number of summary sentences on the results is not considered, and summary quality under different numbers of summary sentences is not compared. [Conclusions] The proposed method provides a useful reference for knowledge aggregation services in academic Q&A communities.
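The abstract names the two ingredients of W2V-MMR (Word2Vec-based similarity and MMR selection) but not the exact scoring formulas. The block below is a minimal sketch of how such a combination could look, not the authors' implementation. Several parts are assumptions made for illustration: sentence vectors are plain averages of word vectors, relevance is cosine similarity to a query (for example the question text), the trade-off parameter `lam` and the names `sentence_vector`, `mmr_select`, and `TOY_VECTORS` are hypothetical, and the hand-written 2-dimensional vectors stand in for vectors from a trained Word2Vec model.

```python
import math

# Toy 2-D "word vectors" (assumption: a real system would use trained Word2Vec vectors).
TOY_VECTORS = {
    "summarization": [1.0, 0.0],
    "summary": [1.0, 0.1],
    "abstract": [1.0, 0.2],
    "community": [0.6, 0.8],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sentence_vector(tokens, word_vectors):
    """Average the vectors of known tokens (assumed representation); None if none are known."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def mmr_select(sentences, query, word_vectors, k=2, lam=0.7):
    """Greedy MMR: pick up to k sentence indices, trading relevance against redundancy."""
    q = sentence_vector(query, word_vectors)
    if q is None:
        raise ValueError("query contains no known words")
    vecs = [sentence_vector(s, word_vectors) for s in sentences]
    candidates = [i for i, v in enumerate(vecs) if v is not None]
    selected = []

    def mmr_score(i):
        relevance = cosine(vecs[i], q)
        # Redundancy is the highest similarity to any sentence already chosen.
        redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
        return lam * relevance - (1.0 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


SENTENCES = [["summary"], ["abstract"], ["community"]]
QUERY = ["summarization"]

print(mmr_select(SENTENCES, QUERY, TOY_VECTORS, k=2, lam=1.0))
print(mmr_select(SENTENCES, QUERY, TOY_VECTORS, k=2, lam=0.3))
```

With `lam=1.0` the selection is driven by relevance alone, so two near-duplicate sentences can both be chosen. Lowering `lam` penalizes similarity to sentences already selected, which is the redundancy control that MMR contributes.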
Authors
Tao Xing (陶兴)
Zhang Xiangxian (张向先)
Guo Shunli (郭顺利)
Zhang Liman (张莉曼)
Tao Xing; Zhang Xiangxian; Guo Shunli; Zhang Liman (School of Management, Jilin University, Changchun 130022, China; School of Communication, Qufu Normal University, Qufu 276826, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
Peking University Core Journals (北大核心)
2020, Issue 4, pp. 109-118 (10 pages)
Data Analysis and Knowledge Discovery
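Funding
One of the research outcomes of the National Social Science Fund of China project "Research on Knowledge Aggregation and Innovative Services of Academic New Media Driven by Big Data" (Grant No. 18BTQ085).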