Metagenomic DNA Sequence Binning Based on Affinity Propagation
Abstract: With the rapid development of next-generation sequencing (NGS) technologies, metagenomics has become a research hotspot. Research in metagenomics, however, faces the problem of binning: the identification and taxonomic characterization of NGS short reads, which here means separating the sequences of a sample containing multiple species without using reference genomes. To address this problem, this paper first analyzes the characteristics of next-generation sequencing technology and the statistical characteristics of metagenomic sequences, then proposes a species clustering algorithm that combines similarity information with structural information and introduces affinity propagation to cluster the sequences. Experimental results show that the method achieves high clustering accuracy and fast execution. We also developed MetaBinning, a metagenomic binning software tool based on this algorithm.
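The abstract does not spell out the similarity measure or the structural features used, so the following is only a minimal sketch of how affinity propagation can bin reads, not the MetaBinning implementation. It assumes k-mer frequency vectors as features, negative squared Euclidean distance as similarity, and the median similarity as the shared preference; all function names (`kmer_profile`, `similarity_matrix`, `affinity_propagation`, `bin_reads`) are hypothetical.

```python
from itertools import product
from statistics import median


def kmer_profile(seq, k=4):
    """Normalized k-mer frequency vector (assumed feature, not from the paper)."""
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = [0.0] * len(index)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k].upper())
        if j is not None:  # skip windows containing non-ACGT symbols
            counts[j] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


def similarity_matrix(vectors):
    """Negative squared Euclidean distances; the diagonal holds the preference."""
    n = len(vectors)
    S = [[-sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
          for j in range(n)] for i in range(n)]
    off_diag = [S[i][j] for i in range(n) for j in range(n) if i != j]
    pref = median(off_diag) if off_diag else 0.0  # assumed: median preference
    for i in range(n):
        S[i][i] = pref
    return S


def affinity_propagation(S, damping=0.5, iterations=200):
    """Damped message passing; returns the exemplar index chosen for each point."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            second = max((scores[k] for k in range(n) if k != best),
                         default=float("-inf"))
            for k in range(n):
                # strongest competing candidate for point i, excluding k itself
                rival = second if k == best else scores[best]
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive support from points other than k
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if A[k][k] + R[k][k] > 0]
    if not exemplars:
        return list(range(n))  # no exemplar emerged: each point is its own bin
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


def bin_reads(reads, k=4):
    """Assign each read the index of its exemplar read."""
    return affinity_propagation(similarity_matrix([kmer_profile(r, k) for r in reads]))
```

Calling `bin_reads(reads, k=4)` on a list of read strings returns one label per read, where reads sharing a label fall into the same bin. This sketch builds the full pairwise similarity matrix, which grows quadratically with the number of reads; the "inverted index" keyword suggests the paper avoids that cost, but the abstract gives no details, so nothing of the kind is attempted here.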
Source: Computer Systems & Applications (《计算机系统应用》), 2013, No. 11, pp. 165-170, 142 (7 pages)
Funding: General Program of the National Natural Science Foundation of China (60970085)
Keywords: metagenomics; DNA sequence; binning; affinity propagation; inverted index