Abstract
This paper studies the application of the Basic Local Alignment Search Tool (BLAST) in the Phylogenetic Analysis of Land Plant Platform (PALPP). For data cleaning, extraction based on gene annotation is combined with extraction based on BLAST similarity search to extract and filter the relevant sequence information, control sequence quality, and remove sequences whose original gene annotation is wrong. For quality control of self-sequenced data, scored alignment based on blastn is combined with template alignment based on blastp to report overall sequence quality and to keep contaminant sequences and pseudogenes out of the database.
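As an illustration of the similarity-based filtering step described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the BLAST search has already been run with BLAST+ tabular output (`-outfmt 6`, default column order) and keeps only queries whose best hit passes identity, alignment-length, and E-value cutoffs. The cutoff values, the function names, and the sample identifiers (`seqA`, `seqB`, `rbcL_ref`) are hypothetical and chosen only for this example.

```python
import csv
import io
from typing import NamedTuple

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


class Hit(NamedTuple):
    qseqid: str
    sseqid: str
    pident: float
    length: int
    evalue: float
    bitscore: float


def parse_hits(handle):
    """Yield one Hit per row of tab-separated BLAST output."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(FIELDS, row))
        yield Hit(rec["qseqid"], rec["sseqid"], float(rec["pident"]),
                  int(rec["length"]), float(rec["evalue"]),
                  float(rec["bitscore"]))


def best_hits(hits):
    """Keep the highest-bitscore hit for each query sequence."""
    best = {}
    for h in hits:
        if h.qseqid not in best or h.bitscore > best[h.qseqid].bitscore:
            best[h.qseqid] = h
    return best


def accepted_queries(hits, min_pident=80.0, min_length=200, max_evalue=1e-10):
    """Return sorted IDs of queries whose best hit meets all cutoffs.

    The default cutoffs are hypothetical, not values from the paper.
    """
    return sorted(
        q for q, h in best_hits(hits).items()
        if h.pident >= min_pident
        and h.length >= min_length
        and h.evalue <= max_evalue
    )


# Hypothetical two-row sample: seqA aligns well, seqB only weakly.
SAMPLE = (
    "seqA\trbcL_ref\t98.5\t550\t8\t0\t1\t550\t1\t550\t0.0\t980\n"
    "seqB\trbcL_ref\t72.0\t120\t30\t2\t1\t120\t5\t124\t1e-05\t60.2\n"
)

if __name__ == "__main__":
    print(accepted_queries(parse_hits(io.StringIO(SAMPLE))))  # ['seqA']
```

In a real pipeline the rejected queries (here `seqB`) would presumably be flagged for review as possible mis-annotated or contaminant sequences rather than silently dropped. How the platform treats them is not stated in the abstract.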
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, PKU Core (北大核心)
2011, Issue 4, pp. 73-75 (3 pages)
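Funding
Supported by the Chinese Academy of Sciences "11th Five-Year Plan" Major Special Project Fund, project "Data Application Environment Construction and Services" (O846061372, O846061108, O846061208)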
Keywords
sequence alignment
data cleaning
Basic Local Alignment Search Tool (BLAST)
Phylogenetic Analysis of Land Plant Platform (PALPP)