Abstract
An SVM (Support Vector Machine) based method for identifying Chinese place names is presented. In our approach, place name candidates are first located according to a reasonable formation assumption; an SVM-based identification strategy then decides whether each candidate is a true place name. Guided by linguistic knowledge, we select two kinds of features: the basic semanteme of each contextual word, and the frequency information of the words inside a place name candidate. This choice dramatically reduces the dimension of the feature space and makes processing more efficient. In open testing on unregistered place names over 8.17 million words of news text, the system built on this approach achieves an F-measure of 83.25.
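For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F-measure is commonly computed from identification counts. It assumes the balanced form (beta = 1) and a percentage scale; the abstract does not state which variant was used. The function name `f_measure` and the counts in the usage line are illustrative only and are not taken from the paper.

```python
def f_measure(true_positives: int, false_positives: int,
              false_negatives: int, beta: float = 1.0) -> float:
    """Return the F-measure as a percentage.

    Assumption: beta = 1 weights precision and recall equally.
    """
    predicted = true_positives + false_positives   # candidates accepted as place names
    actual = true_positives + false_negatives      # place names actually in the text
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    b2 = beta * beta
    return 100.0 * (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative counts only (hypothetical, not the paper's data).
score = f_measure(true_positives=8, false_positives=2, false_negatives=2)
```

With equal precision and recall, the balanced F-measure equals both of them, which is a quick sanity check on any implementation.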
Funding
Foundation of China (Grant No. 60175020 and 60673037) and the National High Technology Research and Development Program of China (Grant No. 2002AA117010-09).