Abstract
Gene recognition is a branch of bioinformatics. Discriminant analysis, a method from multivariate statistics, is simple to formulate and easy to interpret, and it performs well in recognizing splice sites, but it is highly sensitive to outliers. In this paper, we plug robust statistics, such as M-estimators, MVE (minimum volume ellipsoid) estimators, and MCD (minimum covariance determinant) estimators, into classical discriminant analysis and obtain better results. We then add a weighting step, which further improves the robustness and efficiency of the method and yields better recognition performance. The weighted robust discriminant analysis method is highly robust and little affected by outliers, and it has practical and reference value for other classification and discrimination problems.
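The plug-in idea summarized in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the median and the scaled MAD stand in for the M, MVE, and MCD estimators, the reweighting step is a simple hard-rejection rule, and the scores, the class labels `site` and `non_site`, and the cutoff of 2.5 are all hypothetical values chosen for illustration.

```python
import math
import statistics


def classical_fit(xs):
    """Classical plug-in estimates: sample mean and standard deviation."""
    return statistics.fmean(xs), statistics.stdev(xs)


def robust_fit(xs):
    """Robust stand-in: median and MAD, scaled for consistency at the normal."""
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    return med, 1.4826 * mad


def reweighted_fit(xs, cutoff=2.5):
    """Reweighting step: give weight 0 to points farther than `cutoff`
    robust scale units from the median, then re-estimate classically."""
    med, scale = robust_fit(xs)
    kept = [x for x in xs if abs(x - med) <= cutoff * scale]
    return statistics.fmean(kept), statistics.stdev(kept)


def log_density(x, mu, sigma):
    """Gaussian log-density up to an additive constant."""
    return -math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2


def classify(x, params):
    """Quadratic discriminant rule with equal priors."""
    return max(params, key=lambda label: log_density(x, *params[label]))


# Hypothetical scores; the "site" class contains one gross outlier (30.0).
site = [1.8, 2.0, 2.1, 2.2, 1.9, 2.0, 2.3, 1.7, 30.0]
non_site = [-0.2, 0.1, 0.0, -0.1, 0.3, 0.2, -0.3, 0.0]

classical = {"site": classical_fit(site), "non_site": classical_fit(non_site)}
robust = {"site": robust_fit(site), "non_site": robust_fit(non_site)}
reweighted = {"site": reweighted_fit(site), "non_site": reweighted_fit(non_site)}

for name, params in [("classical", classical), ("robust", robust),
                     ("reweighted", reweighted)]:
    print(name, classify(0.8, params), classify(2.0, params))
```

In this toy data the single outlier inflates the classical variance estimate of the contaminated class, so the classical rule assigns the borderline score 0.8 to that class, while the robust and reweighted rules do not. The reweighted fit recovers the efficiency of the mean and standard deviation on the retained points, which is the usual motivation for a reweighting step.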
Authors
SHI Yue (师玥)
JIN Jiao (金蛟)
School of Mathematical Sciences, Key Laboratory of Mathematics and Complex Systems (Ministry of Education), Beijing Normal University, Beijing 100875, China; School of Statistics, Beijing Normal University, Beijing 100875, China
Source
Mathematics in Practice and Theory (《数学的实践与认识》)
Indexed in the Peking University Core Journals list (北大核心)
2016, Issue 22, pp. 202-208 (7 pages)
Funding
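Humanities and Social Sciences Youth Fund of the Ministry of Education of China (15YJC910003)
Fundamental Research Funds for the Central Universities
National Natural Science Foundation of China (10901020)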
Keywords
Gene recognition
Splice sites
Outliers
Reweighted robust discriminant analysis