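Objective: To establish general-purpose data types and a database structure for protein and nucleic acid sequence information, laying a foundation for specialized biomedical information databases, and to design software that acquires data automatically, replacing manual data collection. Methods: The GenBank database, established by the US National Center for Biotechnology Information (NCBI), and the Swiss-Prot database, maintained by the European Bioinformatics Institute (EBI), served as the information sources for the primary and secondary databases, respectively. After analysis of these sources, a new data model was designed and a local database was built. Dedicated software was written to retrieve data from the Internet, classify and organize it, and store it in the database; the software's management and query functions were then used to manage and search the local data. Results: A database of commonly used nucleic acid and protein information was established, and software for acquiring and managing the related information was designed. Taking a heart failure related gene database as an example, a specialized bioinformatics database was implemented with the designed system. A scheme for publishing the database online was proposed. Conclusion: Software designed around the needs of biotechnology research to retrieve biological information from the Internet can greatly improve efficiency. Combining data from different databases and establishing new data types and classifications can improve querying.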
Objective: To create a heart disorder (heart failure) related gene database, with data retrieved from Internet resources, and thereby lay a foundation for developing specialized biomedical databases. Methods: Based on an analysis of GenBank and Swiss-Prot, a relational model was designed. It covered various protein and nucleotide characteristics of genes related to cardiovascular diseases. A new set of software tools was developed to search for and retrieve large amounts of gene, protein, and nucleotide data from GenBank, Swiss-Prot, and other sources. Data collection was accelerated by batch searching and multi-level retrieval. Results: A general heart disorder (hypertension/heart failure) bioinformatics database was constructed, comprising 15 tables and 6 views. It included the symbols and sequences of nucleic acids and proteins, protein structures, chromosomal mappings of genes, gene sources, and gene phenotypes and functions. It also provided links to useful web resources such as PubMed, OMIM, HomoloGene, LocusLink, GDB, SNPs, GeneCards, and HGMD. Conclusion: The new set of software tools, which can automatically retrieve data from multiple data sources on the Internet, is helpful for rapidly collecting specific data. In the future, the disorder related gene database could become a fundamental resource for biomedical research.
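As an illustration only, the relational model described above might be sketched as follows, using Python's standard `sqlite3` module. The table names, column names, the `gene_summary` view, and the placeholder records (`DEMO1`, `NUC0001`, `PRO0001`) are all assumptions made for this sketch; the abstract does not specify the actual 15 tables and 6 views, nor the database engine used.

```python
import sqlite3

# Hypothetical schema: every table, column, and view name here is illustrative
# and is NOT taken from the original database design.
SCHEMA = """
CREATE TABLE gene (
    gene_id    INTEGER PRIMARY KEY,
    symbol     TEXT NOT NULL UNIQUE,
    chromosome TEXT
);
CREATE TABLE nucleotide (
    accession TEXT PRIMARY KEY,   -- e.g. an accession from a nucleotide source
    gene_id   INTEGER NOT NULL REFERENCES gene(gene_id),
    sequence  TEXT NOT NULL
);
CREATE TABLE protein (
    accession TEXT PRIMARY KEY,   -- e.g. an accession from a protein source
    gene_id   INTEGER NOT NULL REFERENCES gene(gene_id),
    sequence  TEXT NOT NULL
);
-- A view that joins gene, nucleotide, and protein records for querying.
CREATE VIEW gene_summary AS
SELECT g.symbol,
       g.chromosome,
       n.accession AS nucleotide_acc,
       p.accession AS protein_acc
FROM gene g
LEFT JOIN nucleotide n ON n.gene_id = g.gene_id
LEFT JOIN protein p ON p.gene_id = g.gene_id;
"""


def build_demo_db():
    """Create an in-memory database and load one made-up placeholder record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO gene VALUES (1, 'DEMO1', '1')")
    conn.execute("INSERT INTO nucleotide VALUES ('NUC0001', 1, 'ATGGCC')")
    conn.execute("INSERT INTO protein VALUES ('PRO0001', 1, 'MA')")
    conn.commit()
    return conn


def lookup(conn, symbol):
    """Return all summary rows for a gene symbol (empty list if unknown)."""
    cur = conn.execute(
        "SELECT symbol, chromosome, nucleotide_acc, protein_acc "
        "FROM gene_summary WHERE symbol = ?",
        (symbol,),
    )
    return cur.fetchall()
```

Calling `lookup(build_demo_db(), "DEMO1")` returns the joined row for the placeholder gene. Keeping nucleotide and protein records in separate tables keyed to a gene is one plausible way to combine data from different source databases under a single query interface.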
Journal of Peking University: Health Sciences
Supported by the National Key Basic Research Development Program of China (G2000056907)