Abstract
This paper first introduces five common methods of feature selection and feature extraction. Next, the K-nearest neighbor (KNN) algorithm is used as the evaluating classifier to compare the performance of four feature selection methods on text categorization. Based on an analysis of the test results, improved and practical mutual information evaluation functions for feature selection are proposed, and the experimental results show that they are effective.
Source
《海南大学学报(自然科学版)》
CAS
2007, No. 1, pp. 62-66 (5 pages)
Natural Science Journal of Hainan University
Keywords
text categorization
feature reduction
feature selection
mutual information