
Data Cleaning and Quality Control Scheme Based on BLAST (cited by: 1)
Abstract: This paper studies the application of the Basic Local Alignment Search Tool (BLAST) in the Phylogenetic Analysis of Land Plant Platform (PALPP). For data cleaning, it combines extraction based on gene annotation with extraction based on BLAST similarity alignment to extract and filter the relevant sequence information, control sequence quality, and remove sequences whose original gene annotation is wrong. For quality control of self-sequenced data, it combines score-based alignment using blastn with template-based alignment using blastp to report overall sequence quality and to keep contaminating sequences and pseudogenes out of the database.
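The similarity-based filtering step described in the abstract can be illustrated with a minimal sketch. It assumes BLAST+ tabular output (`-outfmt 6` with the default twelve columns) and keeps only those query sequences whose best-scoring hit meets identity and E-value cutoffs. The function name `passing_queries`, the cutoff values, and the sample records are hypothetical and are not taken from the paper, which does not state its thresholds here.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def passing_queries(tabular_text, min_identity=90.0, max_evalue=1e-10):
    """Return {query_id: best_hit} for queries whose best hit passes.

    The default cutoffs are illustrative assumptions, not the paper's values.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines, comment lines, and incomplete records.
        if len(row) < len(FIELDS) or row[0].startswith("#"):
            continue
        hit = dict(zip(FIELDS, row))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        # Keep only the highest-bitscore hit for each query sequence.
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    # A query passes only if its best hit satisfies both cutoffs.
    return {q: h for q, h in best.items()
            if h["pident"] >= min_identity and h["evalue"] <= max_evalue}


# Made-up records: seq1 matches the reference well, seq2 does not.
sample = (
    "seq1\trbcL_ref\t98.5\t600\t9\t0\t1\t600\t1\t600\t1e-150\t1050\n"
    "seq2\trbcL_ref\t72.0\t120\t30\t4\t1\t120\t10\t129\t1e-03\t45.2\n"
)
kept = passing_queries(sample)
```

Queries absent from `kept` would be candidates for rejection or manual review. A real pipeline would likely also check alignment coverage, which this sketch leaves out.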
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2011, No. 4, pp. 73-75 (3 pages)
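Funding: Chinese Academy of Sciences "11th Five-Year Plan" major special fund project "Data Application Environment Construction and Services" (O846061372, O846061108, O846061208)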
Keywords: sequence alignment; data cleaning; Basic Local Alignment Search Tool (BLAST); Phylogenetic Analysis of Land Plant Platform (PALPP)

