Genome sequencing is the process of determining the order in which the nitrogenous bases (nucleotides) within a DNA molecule are arranged. Every organism's genome consists of a unique sequence of nucleotides, and these nucleotide bases underlie the genotype and phenotype of a cell. Determining the sequence in which the nucleotides are bonded lets scientists and researchers compare DNA between organisms, which can help show how the organisms are related. In mathematics, graph theory is the study of objects known as graphs, which are made of vertices (or nodes) connected by either directed or undirected edges. In this research, we study how graph theory plays a vital part in genome sequencing and survey the different types of graphs used during DNA sequencing. We present several ways in which graph theory is used to sequence the genome. We also explore how Hamiltonian graphs, Euler graphs, and de Bruijn graphs are used to sequence the genome, and the advantages and disadvantages associated with each.
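As a rough illustration of the de Bruijn and Euler graph approach named in the abstract above, the following Python sketch builds a de Bruijn graph from the k-mers of a set of reads and walks an Eulerian path through it. It is a minimal sketch under stated assumptions, not the method of the paper: the function names (`de_bruijn_graph`, `eulerian_path`, `assemble`) are hypothetical, the reads are assumed to be error-free, each distinct k-mer is used once (multiplicity is ignored), and an Eulerian path is assumed to exist.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer.

    Simplification (assumption): duplicate k-mers are collapsed, so repeat
    multiplicity is not modeled.
    """
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])  # edge: prefix -> suffix
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm. Assumes the graph actually has an Eulerian path."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    nodes = set(graph) | set(in_deg)
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(
        (n for n in sorted(nodes) if len(graph.get(n, [])) - in_deg[n] == 1),
        min(nodes),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: emit node
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


print(assemble(["ATGC", "TGCG", "GCGA"], 3))
```

When a (k-1)-mer repeats in the genome, the graph can admit more than one Eulerian path, so the reconstruction is not unique in general; the toy reads here were chosen so that only one path exists.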
Reconstruction of the transcriptome by de novo assembly from next-generation sequencing (NGS) short-sequence reads provides an essential means to catalog expressed genes, identify splicing isoforms, and capture the expression detail of transcripts for organisms with no reference genome available. De novo transcriptome assembly faces many unique challenges, including alternative splicing, variable expression levels covering a dynamic range of several orders of magnitude, and artifacts introduced by reverse transcription. In the current review, we illustrate the overall strategy for applying the De Bruijn Graph (DBG) approach to de novo transcriptome assembly. We further analyze many parameters proven critical in transcriptome assembly using DBG. Among them, k-mer length, coverage depth of reads, genome complexity, and the performance of different programs are addressed in greater detail. A multi-k-mer strategy balancing efficiency and sensitivity is discussed and highly recommended for de novo transcriptome assembly. Future directions point to the combination of NGS and third-generation sequencing technology, which would greatly enhance the power of de novo transcriptomics studies.
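To make the interplay between k-mer length and coverage depth mentioned above more concrete, the following Python sketch counts k-mers for several values of k and keeps only those seen at least a minimum number of times. This is only an illustration under assumptions, not the procedure of any particular assembler: the function names (`kmer_counts`, `solid_kmers_by_k`) and the `min_count` threshold are hypothetical.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers_by_k(reads, ks, min_count=2):
    """For each k, keep the k-mers observed at least min_count times.

    min_count is an illustrative threshold (assumption), standing in for a
    coverage cutoff.
    """
    return {
        k: {kmer for kmer, c in kmer_counts(reads, k).items() if c >= min_count}
        for k in ks
    }


reads = ["ATGCAT", "TGCATG"]
for k, kmers in solid_kmers_by_k(reads, ks=[3, 5]).items():
    print(k, sorted(kmers))
```

In this toy input, all four 3-mers pass the threshold but only one 5-mer does, which hints at why a longer k demands deeper coverage while a shorter k retains low-coverage sequence, and why combining several k values can balance the two.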
Superposition of signals in a DNA molecule is a sufficiently general principle of information coding. The necessary requirement for such superposition is the degeneracy of the code, which allows different messages to be placed on the same DNA fragment. Code words that are equivalent in the informational sense (i.e., synonyms) form a synonymous group, and the entire set of code words is partitioned into synonymous groups. This paper is dedicated to constructing and analyzing a model of synonymous coding. We evaluate some characteristics of synonymous coding as applied to code words of length two, although many definitions may be extended to words of arbitrary length.
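De Bruijn's theorem is an important combinatorial counting method, and this paper generalizes it in a very natural way. A P-graph is an orbit formed by a graph G under the action of a permutation group P on its vertices. The paper introduces the concepts of the P-graph, the color capacity index of a P-graph, the color weight polynomial of a P-graph with respect to a color permutation group H, and color-symmetric and fully color-symmetric graphs. It establishes a formula for computing the color weight polynomial, together with a series of combinatorial formulas and properties.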
Funding: This work is supported in part by grants from the National Basic Research Program of China (Nos. 2012CB316501, 2012CB517905 and 2013CB127000) and the National Natural Science Foundation of China (Nos. 31571310 and 31271409).