Abstract
Taking Nishan Shaman as an example, this paper uses corpus techniques to protect and pass on the cultural classics of ethnic minorities. A prototype parallel corpus system for Manchu classics is built on the basis of Nishan Shaman. The study focuses on corpus alignment for this parallel corpus: two paragraph alignment methods and three sentence alignment methods are examined and their performance evaluated, and the methods best suited to the corpus are selected. Test results show that the selected alignment methods meet the requirements for constructing a Manchu classics parallel corpus, and they may serve as a reference for building similar ethnic minority corpora.
Authors
田春燕
徐毅
解威
郭淑云
TIAN Chun-yan1, XU Yi2, XIE Wei2, GUO Shu-yun3 (1. School of Foreign Languages; 2. School of Science; 3. Research Institute of Ethnic Groups in Northeast China, Dalian Minzu University, Dalian, Liaoning 116605, China)
Source
Journal of Dalian Minzu University (《大连民族大学学报》)
2018, No. 3, pp. 264-268 (5 pages)
Funding
Project supported by the Fundamental Research Funds for the Central Universities (20150415)
Keywords
classics corpus
Nishan Shaman
paragraph alignment
sentence alignment