Abstract
This paper applies a support vector machine (SVM), a statistical machine learning model, to authorship attribution for works by eight representative authors of modern and contemporary Chinese literature. A range of word-based statistics serve as the identification features, and training follows a low-density, multi-feature approach. The method achieves excellent identification performance on cross-genre works.
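As an illustration of what word-based statistical features can look like, the following is a minimal sketch, not the paper's actual feature set, which the abstract does not list. The function name `lexical_features`, the `FUNCTION_WORDS` list, and the particular statistics chosen (type-token ratio, average word length, hapax ratio, function-word frequencies) are all assumptions made for this example. The input is assumed to be already segmented into words.

```python
from collections import Counter

# Hypothetical function-word list; the abstract does not say which words were used.
FUNCTION_WORDS = ("的", "了", "是", "在", "和")


def lexical_features(tokens):
    """Map a pre-segmented text (a list of word tokens) to a small feature vector."""
    n = len(tokens)
    if n == 0:
        raise ValueError("empty token list")
    counts = Counter(tokens)
    # Vocabulary richness: distinct words divided by total words.
    type_token_ratio = len(counts) / n
    # Average word length in characters.
    avg_word_length = sum(len(t) for t in tokens) / n
    # Share of tokens whose word occurs exactly once in the text.
    hapax_ratio = sum(1 for c in counts.values() if c == 1) / n
    # Relative frequency of each assumed function word.
    function_word_freqs = [counts[w] / n for w in FUNCTION_WORDS]
    return [type_token_ratio, avg_word_length, hapax_ratio, *function_word_freqs]


# Toy input, invented for illustration only.
sample = ["我", "的", "故乡", "在", "江南", "的", "水乡"]
features = lexical_features(sample)
```

Vectors of this kind, one per text, could then be passed to an SVM classifier (for example scikit-learn's `sklearn.svm.SVC`) with the author as the class label.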
Source
《计算机工程与应用》
CSCD
Peking University Core Journal (北大核心)
2010, No. 4, pp. 226-229 (4 pages)
Computer Engineering and Applications
Funding
National Social Science Fund of China project, No. 07BYY050
Keywords
authorship attribution
machine learning
computational stylistics
modern and contemporary literature