
面向藏文信息处理的藏语短语分类体系研究

Research on Tibetan Phrase Classification System for Tibetan Information Processing
Abstract: Research on a classification system for Tibetan phrases is an important component of Tibetan language information processing and a key technical challenge. The results will be applied directly to the construction of a large-scale general-purpose Tibetan corpus and have significant application value for Tibetan character recognition, automatic word segmentation, automatic proofreading, information retrieval, text classification, and machine translation. They also provide a driving force and foundation for future Tibetan information dissemination and exchange and for research on Tibetan language intelligence. Phrases are an important feature and a major part of Tibetan grammar. As in other languages, Tibetan phrases follow definite grammatical rules and are formed by combining content words and function words. The relationships between words, and between words and function words, are the focus of Tibetan phrase research and one of the directions in Tibetan phrase structure that deserves attention. Phrase classification is one of the fundamental tasks of natural language processing and has drawn sustained attention from researchers in recent years. Drawing on a large number of Tibetan phrases extracted from a large-scale Tibetan corpus, this paper analyzes their internal structure and grammatical functions in depth, consults the classification systems for Tibetan phrases in the linguistic literature, classifies Tibetan phrases following the principle that the result should be easy for computers to analyze and process automatically, and specifies the tag codes for Tibetan phrase category units in information processing.
Authors: Cai Zang Tai (才藏太); Suo Nan Cai Rang (索南才让). State Key Laboratory of Tibetan Intelligent Information Processing and Application, Qinghai Normal University, Xining, Qinghai 810008
Source: Journal of Qinghai Minzu University: Tibetan Version (《青海民族大学学报(藏文版)》), CSSCI, 2023, No. 3, pp. 99-110 (12 pages)
Keywords: Tibetan information; phrase; classification
