Abstract
Chinese word segmentation is a key step in Chinese natural language processing, and the data structure of the word dictionary directly affects segmentation speed and efficiency. To speed up dictionary lookup, the system uses a dictionary-based forward maximum matching algorithm that grows the candidate word one character at a time, with the dictionary stored as an improved two-level hash table combined with dynamic arrays. Compared with typical existing dictionary mechanisms, this improves the speed and efficiency of Chinese word segmentation to some extent without increasing the dictionary's space complexity or maintenance complexity.
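The abstract does not spell out the dictionary layout, so the following is only a minimal sketch of the general idea under stated assumptions: the first hash level is keyed by a word's first character, the second level by word length, and each bucket is a plain Python list standing in for the dynamic array. The names `TwoLevelDict`, `longest_match`, and `segment`, as well as the sample word list, are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class TwoLevelDict:
    """Assumed layout: first character -> word length -> list of words."""

    def __init__(self, words):
        self.index = defaultdict(dict)
        for w in words:
            if not w:
                continue
            # Level 1: first character. Level 2: word length.
            bucket = self.index[w[0]].setdefault(len(w), [])
            if w not in bucket:
                bucket.append(w)  # the list plays the role of the dynamic array

    def longest_match(self, text, start):
        """Length of the longest dictionary word beginning at text[start].

        Returns 1 when nothing matches, so an unknown character becomes
        a single-character token.
        """
        by_length = self.index.get(text[start])
        if not by_length:
            return 1
        best = 1
        remaining = len(text) - start
        # Grow the candidate one character at a time and remember the
        # longest candidate found in the dictionary.
        for n in range(2, min(max(by_length), remaining) + 1):
            if text[start:start + n] in by_length.get(n, ()):
                best = n
        return best


def segment(text, dictionary):
    """Forward maximum matching: scan left to right, taking the longest match."""
    tokens = []
    i = 0
    while i < len(text):
        n = dictionary.longest_match(text, i)
        tokens.append(text[i:i + n])
        i += n
    return tokens


# Illustrative word list only, not the paper's dictionary.
words = ["中文", "分词", "中文分词", "自然", "自然语言", "语言", "处理", "自然语言处理"]
d = TwoLevelDict(words)
print(segment("中文分词是自然语言处理的基础", d))
# → ['中文分词', '是', '自然语言处理', '的', '基', '础']
```

In this sketch the first-character lookup bounds the candidate lengths for each position, so the scan never tries lengths longer than the longest word sharing that first character. Keeping each bucket sorted and searching it by bisection would be one possible refinement, though whether the paper does so is not stated in the abstract.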
Source
《软件导刊》
2010, Issue 10, pp. 54-55 (2 pages)
Software Guide
Keywords
Natural Language Processing
Chinese Word Segmentation
Maximum Matching Algorithm
Double Hash Table