An SVM (Support Vector Machine) based method for identifying Chinese place names is presented. In our approach, place name candidates are located according to a rational formation assumption, and an SVM-based identification strategy then decides whether each candidate is a true place name. Drawing on linguistic knowledge, we select two kinds of features: the basic semanteme of each contextual word and the frequency information of the words inside a place name candidate. This choice dramatically reduces the dimension of the feature space and makes processing more efficient. In open testing on unregistered place names, the method achieves an F-measure of 83.25 on 8.17 million words of news text in this project.
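The two-stage idea described above (locate candidates, then let an SVM accept or reject each one) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the suffix list, the `locate_candidates` rule, the two toy features (a context cue flag and an in-candidate frequency score), and the small subgradient-descent trainer are all hypothetical stand-ins, since the abstract does not give the actual forming assumption, feature encoding, or SVM settings.

```python
from typing import List, Sequence, Tuple

# Hypothetical forming assumption: a candidate is a short string that ends
# with a common place-name suffix character. The paper's actual rule is not
# given in the abstract.
PLACE_SUFFIXES = ("市", "县", "省", "镇", "村")


def locate_candidates(text: str, max_len: int = 4) -> List[str]:
    """Return every substring of length 2..max_len that ends in a place suffix."""
    candidates = []
    for end, ch in enumerate(text):
        if ch not in PLACE_SUFFIXES:
            continue
        for length in range(2, max_len + 1):
            start = end - length + 1
            if start >= 0:
                candidates.append(text[start:end + 1])
    return candidates


def score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lam: float = 0.01,
    lr: float = 0.1,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * score(w, b, xi)
            # Weight decay from the regulariser, applied at every sample.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Sample is inside the margin: step along the hinge subgradient.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 (accept as place name) or -1 (reject)."""
    return 1 if score(w, b, x) >= 0.0 else -1


# Toy, made-up feature vectors: [context word is a location cue (0/1),
# frequency score of the characters inside the candidate].
X_train = [[1.0, 0.9], [1.0, 0.8], [1.0, 0.7], [0.0, 0.1], [0.0, 0.2], [0.0, 0.3]]
y_train = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X_train, y_train)

print(locate_candidates("他在北京市工作"))
print(predict(w, b, [1.0, 0.85]), predict(w, b, [0.0, 0.15]))
```

Keeping each candidate's feature vector this small mirrors the abstract's point that a compact feature set shrinks the feature space; a real system would presumably use a full SVM package and kernels rather than this hand-rolled linear trainer.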
This paper presents a flexible and efficient method for identifying Chinese personal names based on SVM (Support Vector Machines). In this approach, formation rules for personal names are used to select a candidate set, and SVM-based identification strategies then recognize the real personal names within that set. The basic semanteme of words in context and the frequency information of words inside a candidate are selected as features, which dramatically reduces the scale of the feature space and makes computation more efficient. In open testing in this project, the method achieved an F-measure of 90.59% on 2 million words of news text and 86.67% on 16.17 million words of news text.
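For readers unfamiliar with the metric, the F-measure reported above combines precision and recall into a single score. A minimal sketch follows, assuming the balanced form (beta = 1, the harmonic mean); the abstract does not state which weighting was used, and the `f_measure` helper and the sample values are illustrative only, not figures from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was found correctly
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical precision and recall values, chosen only to show the call.
print(f_measure(0.92, 0.88))
```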
Funding: Foundation of China (Grant Nos. 60175020 and 60673037) and the National High Technology Research and Development Program of China (Grant No. 2002AA117010-09).