Abstract
The generalization ability of Chinese text categorization based on the support vector machine (SVM) depends closely on parameter selection, and parameter optimization has a considerable effect on classification accuracy. To address the difficulty of tuning SVM parameters, we propose a text categorization method in which simulated annealing (SA) optimizes the SVM. Text classification accuracy serves as the SA optimization objective, and SA's strong search ability is used to find the optimal SVM parameter combination. Experiments on the same data set show that SA has stable global search performance and is an effective way to optimize SVM parameters. Compared with other text categorization algorithms, Chinese text categorization based on SA-SVM achieves higher classification accuracy and stronger generalization ability, and shows good overall classification performance.
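The search described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the two tuned parameters are the SVM penalty C and an RBF kernel width gamma (the abstract does not name them), searches them on a log10 scale, and uses geometric cooling with the Metropolis acceptance rule. The function `toy_accuracy` is a hypothetical stand-in for the real objective, which would be the classification accuracy of an SVM trained with the candidate parameters.

```python
import math
import random


def simulated_annealing(objective, bounds, n_iter=500,
                        t_start=1.0, t_end=1e-3, step=0.5, seed=0):
    """Maximize objective(params) inside box bounds with simulated annealing.

    params is a list of floats; here it stands for [log10 C, log10 gamma]
    (an assumption, since the abstract does not list the parameters).
    """
    rng = random.Random(seed)
    current = [rng.uniform(lo, hi) for lo, hi in bounds]
    current_score = objective(current)
    best, best_score = list(current), current_score

    # Geometric cooling: temperature falls from t_start to t_end in n_iter steps.
    cooling = (t_end / t_start) ** (1.0 / n_iter)
    temp = t_start

    for _ in range(n_iter):
        # Propose a neighbour by Gaussian perturbation, clipped to the bounds.
        candidate = [min(hi, max(lo, x + rng.gauss(0.0, step)))
                     for x, (lo, hi) in zip(current, bounds)]
        cand_score = objective(candidate)
        delta = cand_score - current_score

        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = list(current), current_score

        temp *= cooling

    return best, best_score


def toy_accuracy(params):
    """Hypothetical stand-in for SVM classification accuracy.

    Peaks at log10 C = 1 and log10 gamma = -2; replace it with a real
    evaluation of an SVM trained with C = 10**log_c, gamma = 10**log_gamma.
    """
    log_c, log_gamma = params
    return math.exp(-((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2) / 4.0)


# Assumed search ranges for log10 C and log10 gamma.
bounds = [(-2.0, 3.0), (-4.0, 1.0)]
best_params, best_score = simulated_annealing(toy_accuracy, bounds)
print("best [log10 C, log10 gamma]:", best_params, "score:", best_score)
```

To apply this to an actual classifier, `toy_accuracy` could be swapped for a function that returns cross-validated accuracy, for example from scikit-learn's `SVC` and `cross_val_score`. Keeping the best-so-far point separate from the current point means the occasional acceptance of worse candidates, which lets the search escape local optima, never loses the best parameters found.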
Authors
Guo Chaolei (郭超磊)
Chen Junhua (陈军华)
The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201400, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2019, Issue 3, pp. 277-281 (5 pages)
Funding
Shanghai Normal University Fund Project (C-6105-15-057)
Keywords
Chinese text categorization
Support vector machine (SVM)
Simulated annealing algorithm
Parameter optimization