Data classification is an important data mining task in biomedicine. After analyzing the characteristics of biomedical data, this paper proposes a method for analyzing colorectal carcinoma diagnosis data based on the counting KNN algorithm. Although counting-weighted k-nearest neighbours (CwKNN) classification is simple and effective, it does not handle biomedical data well. After analyzing the algorithm's performance, a novel counting KNN algorithm based on an index tree and sample density is presented. The new method improves classification accuracy by applying different algorithms for overall density and K-local density, and improves efficiency by using a tree-structured index. Experiments show that this method outperforms both distance-based voting KNN and CwKNN. More importantly, it is a single method that works for ordinal, nominal, or mixed data.
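For orientation, the distance-based voting KNN baseline named in the abstract can be sketched as follows. This is a minimal illustration and not the paper's counting KNN: it has no index tree and no density weighting. The mixed-attribute distance (a 0/1 mismatch for nominal attributes, a range-normalized absolute difference for ordinal or numeric ones) is an assumption, and the names `mixed_distance` and `knn_vote` and the toy records are hypothetical.

```python
from collections import Counter


def mixed_distance(a, b, nominal, ranges):
    """Average per-attribute distance between two records (assumed, not from the paper).

    Nominal attributes contribute 0 on a match and 1 on a mismatch;
    all other attributes contribute |x - y| scaled by the training range.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if i in nominal:
            total += 0.0 if x == y else 1.0
        elif ranges[i]:
            total += abs(x - y) / ranges[i]
    return total / len(a)


def knn_vote(train, labels, query, k, nominal):
    """Classify `query` by majority vote among its k nearest training records."""
    # Range of each non-nominal attribute, used to put attributes on a common scale.
    ranges = {}
    for i in range(len(query)):
        if i not in nominal:
            column = [row[i] for row in train]
            ranges[i] = max(column) - min(column)

    # Brute-force neighbour search: sort training indices by distance to the query.
    order = sorted(
        range(len(train)),
        key=lambda j: mixed_distance(train[j], query, nominal, ranges),
    )
    votes = Counter(labels[j] for j in order[:k])
    return votes.most_common(1)[0][0]


# Made-up toy records: (ordinal score, nominal category); attribute 1 is nominal.
train = [(1, "a"), (2, "a"), (8, "b"), (9, "b")]
labels = ["A", "A", "B", "B"]
prediction = knn_vote(train, labels, (1.5, "a"), k=3, nominal={1})
```

The brute-force sort compares the query against every training record; that linear scan is the step a tree-structured index is meant to avoid.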
Computer Engineering and Applications
The National Natural Science Foundation of China under Grant No. 60776834
The Natural Science Foundation of Hunan Province of China under Grant No. 06JJ50143