Abstract
Based on miRNA expression profiling data sets, a new data mining algorithm, tSVM-kNN (t statistic with support vector machine-k nearest neighbor), is proposed. First, the t-statistic method is used to make a preliminary selection of features from the data set. Second, the SVM-kNN algorithm, which combines the ideas of the support vector machine (SVM) and k-nearest-neighbor (kNN) discrimination, serves as the classifier. Finally, the classification results are output. Simulation experiments show that the SVM-kNN classifier outperforms both SVM and kNN run on their own. In terms of the number of miRNA "labels" and recognition accuracy, tSVM-kNN needs only 5 miRNAs to reach a classification accuracy of 96.08%. Compared with similar existing algorithms, the proposed algorithm shows a clear advantage.
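The feature pre-selection step described in the abstract can be illustrated with a small sketch. The abstract does not say which form of the t statistic is used or how the data are laid out, so the following assumes a two-class problem with labels 0 and 1, Welch's unequal-variance t statistic, and ranking by absolute value. The function names `t_scores` and `top_k_features` and the toy data are hypothetical, and the SVM-kNN classifier itself is not shown.

```python
import math
from statistics import mean, variance


def t_scores(X, y):
    """Return one Welch t statistic per feature (column) of X.

    Assumption: y holds binary labels, 1 for one class and 0 for the other,
    and each class has at least two samples.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        # Standard error of the difference in means (unequal variances).
        se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
        scores.append((mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Indices of the k features with the largest |t|, best first."""
    scores = t_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: abs(scores[j]), reverse=True)
    return ranked[:k]


# Toy expression matrix: rows are samples, columns are (hypothetical) miRNAs.
X = [
    [5.1, 1.0, 0.2],
    [4.9, 1.2, 0.1],
    [5.0, 0.9, 0.3],
    [1.0, 1.1, 0.2],
    [1.2, 1.0, 0.1],
    [0.9, 1.3, 0.3],
]
y = [1, 1, 1, 0, 0, 0]

selected = top_k_features(X, y, k=1)
print(selected)
```

In the setting of the abstract, `k` would be 5, and the selected columns would then be passed to the classifier.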
Source
Journal of Northwest Normal University (Natural Science) (《西北师范大学学报(自然科学版)》)
Indexed in: CAS, 北大核心 (Peking University Core Journals)
2016, No. 2, pp. 47-52 (6 pages)
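Funding
Natural Science Foundation of Guangdong Province (2015A030310354)
Innovation and Strong School Project of the Guangdong Provincial Department of Education (Q14606)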