Author Identification of The Dream of Red Mansions Based on the Rank Correlation of the High Frequency Words (original title: 从高频词等级相关角度探析《红楼梦》作者)
Abstract: This paper proposes a method based on the rank correlation of high-frequency words for investigating the authorship of disputed texts. In each corpus, word types are sorted by descending frequency and assigned ranks. The correlation between the high-frequency word ranks of two corpora is then computed and used to infer how similar their language styles are. The method is compared with a method based on the co-occurrence rate of word types and a method based on the co-occurrence rate of word tokens. The 120 chapters of The Dream of Red Mansions (《红楼梦》) are divided evenly into 12 corpora, and the rank-correlation method is used to compute the correlation between every pair of them. The first 8 corpora correlate highly with one another, as do the last 4, whereas the correlation between the first 8 and the last 4 is low. The paper infers that the first 80 chapters were probably written by one author and the last 40 chapters by another.
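The abstract does not say which rank-correlation coefficient the authors use or how many high-frequency words they keep, so the following is only a minimal sketch of the general idea, not the paper's procedure. It assumes Spearman's rho, computed over the word types that appear in both corpora's top-`n` lists and re-ranked within that shared set. The function names, the default `n=100`, and the alphabetical tie-breaking are all hypothetical choices. Input is assumed to be already word-segmented, which for Chinese text requires a separate segmentation step.

```python
from collections import Counter
from itertools import combinations


def ranked_top_words(tokens, n):
    """Return the n most frequent word types, most frequent first."""
    counts = Counter(tokens)
    # Assumption: equal frequencies are ordered alphabetically so ranks are deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return ordered[:n]


def spearman_on_shared(tokens_a, tokens_b, n=100):
    """Spearman's rho over word types found in both top-n lists (illustrative choice)."""
    top_a = ranked_top_words(tokens_a, n)
    top_b = ranked_top_words(tokens_b, n)
    shared = set(top_a) & set(top_b)
    m = len(shared)
    if m < 2:
        raise ValueError("need at least two shared high-frequency words")
    # Re-rank the shared words 1..m inside each corpus, keeping their relative order.
    rank_a = {w: i for i, w in enumerate((w for w in top_a if w in shared), start=1)}
    rank_b = {w: i for i, w in enumerate((w for w in top_b if w in shared), start=1)}
    # Classical tie-free formula: rho = 1 - 6 * sum(d^2) / (m * (m^2 - 1)).
    d2 = sum((rank_a[w] - rank_b[w]) ** 2 for w in shared)
    return 1 - 6 * d2 / (m * (m * m - 1))


def pairwise_correlations(segments, n=100):
    """Correlation for every pair of segments, keyed by (i, j) with i < j."""
    return {
        (i, j): spearman_on_shared(segments[i], segments[j], n)
        for i, j in combinations(range(len(segments)), 2)
    }
```

Applied to 12 token lists, one per ten-chapter segment, `pairwise_correlations` would return the 66 pairwise values that a comparison like the one in the abstract relies on. Whether such a sketch reproduces the reported pattern depends on choices the abstract leaves unspecified.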
Authors: MA Chuangxin (马创新); CHEN Xiaohe (陈小荷). Affiliations: Linguistic Sciences and Arts School of Jiangsu Normal University, Xuzhou, Jiangsu 221009, China; College of Liberal Arts, Nanjing Normal University, Nanjing, Jiangsu 210097, China
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2018, Issue 11, pp. 97-102 (6 pages)
Funding: Jiangsu Provincial Social Science Fund (15YYC001)
Keywords: high-frequency words; rank; correlation; author identification