Abstract
K-nearest feature line (KNFL) is a classification algorithm proposed in recent years and widely applied in face recognition. It holds that a line connecting two points in a class space represents the characteristics of that class better than the individual points within the class. Classifying by feature-line distance alone, however, can cause misclassification. To eliminate the influence of within-class outliers on classification, this paper introduces a weighting coefficient, combines it with the concept of class-center distance to form an improved algorithm, and applies that algorithm to large-scale text classification. Experimental results show that the improved algorithm raises text classification accuracy and substantially reduces the classifier's dependence on training-set size.
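The idea summarized above can be illustrated with a minimal Python sketch. The feature-line distance is the standard point-to-line distance obtained by projecting the query onto the line through two same-class samples. Everything else is an assumption made for illustration: the abstract does not give the paper's weighting formula, so the mixing parameter `alpha`, the linear combination of line distance and class-center distance, and the function names `feature_line_distance`, `class_center`, and `knfl_classify` are hypothetical and should not be read as the authors' method.

```python
import math
from collections import Counter
from itertools import combinations


def feature_line_distance(x, p1, p2):
    """Distance from x to the (infinite) line through p1 and p2."""
    d = [b - a for a, b in zip(p1, p2)]
    denom = sum(v * v for v in d)
    if denom == 0.0:
        # Degenerate line: both prototypes coincide.
        return math.dist(x, p1)
    # Position of the projection of x along the line.
    mu = sum((xi - a) * di for xi, a, di in zip(x, p1, d)) / denom
    proj = [a + mu * di for a, di in zip(p1, d)]
    return math.dist(x, proj)


def class_center(points):
    """Arithmetic mean of a class's training vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def knfl_classify(x, train, k=3, alpha=0.5):
    """Vote among the k best-scoring feature lines.

    train maps label -> list of vectors (at least two per class).
    ASSUMPTION: score = (1 - alpha) * line distance + alpha * center
    distance. This combination is hypothetical, not the paper's formula.
    """
    candidates = []
    for label, pts in train.items():
        center_dist = math.dist(x, class_center(pts))
        for p1, p2 in combinations(pts, 2):
            line_dist = feature_line_distance(x, p1, p2)
            score = (1 - alpha) * line_dist + alpha * center_dist
            candidates.append((score, label))
    candidates.sort(key=lambda t: t[0])
    votes = Counter(label for _, label in candidates[:k])
    return votes.most_common(1)[0][0]


# Toy data: the extended feature line of class "A" passes close to a
# query that actually sits next to class "B".
train = {
    "A": [(0.0, 0.0), (1.0, 0.0)],
    "B": [(4.5, 1.0), (5.5, 1.0), (5.0, 2.0)],
}
query = (5.0, 0.2)
print(knfl_classify(query, train, k=1, alpha=0.0))  # line distance only
print(knfl_classify(query, train, k=1, alpha=0.5))  # with center distance
```

In this toy case, line distance alone picks class "A" because its feature line, extended far beyond the two samples, runs near the query; adding the class-center term picks "B". This is only meant to show the kind of misclassification the abstract refers to, under the assumptions stated above.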
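Source
Microcomputer Information (《微计算机信息》)
PKU Core (Peking University Core Journals list)
2005, Issue 11S, pp. 159-160, 163 (3 pages in total)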
Control & Automation
Funding
Supported by a fund of the Education Department of Henan Province, project No. sp200303099
Keywords
K-nearest feature line; outlier; class-center distance