Abstract
The suffix tree is an important data structure with wide application in fields related to string processing, and constructing the tree is both the prerequisite for and the key to applying it to a problem. This paper first introduces the concepts behind a novel data structure, the suffix tree, and on that basis discusses its characteristics and its construction algorithm. It then examines the application of suffix trees and their algorithms to Chinese word segmentation and association analysis. Finally, taking Chinese document clustering as an example and accounting for the fact that Chinese text must first be segmented into words, it designs the structure of a clustering system based on the suffix tree clustering algorithm.
Source
Journal of Jiaozuo University (《焦作大学学报》)
2007, Issue 3, pp. 70-72 (3 pages)
Keywords
suffix tree
association analysis
clustering