Abstract: Recent deep sequencing surveys of mammalian genomes have unexpectedly revealed pervasive and complex transcription and identified tens of thousands of RNA transcripts that do not code for proteins. These non-coding RNAs (ncRNAs) highlight the central role of RNA in gene regulation. ncRNAs are arbitrarily divided into two main groups: the first includes small RNAs, such as miRNAs, piRNAs, and endogenous siRNAs, that usually range from 20 to 30 nt, while the second includes long non-coding RNAs (lncRNAs), which are typically more than 200 nt in length. These ncRNAs were initially thought to merely regulate gene expression at the post-transcriptional level, but recent studies have indicated that ncRNAs, especially lncRNAs, are extensively associated with diverse chromatin remodeling complexes and target them to specific genomic loci to alter DNA methylation or histone status. These findings suggest an emerging theme of ncRNAs in epigenetic regulation. In this review, we discuss the wide spectrum of ncRNAs in the regulation of DNA methylation and chromatin state, outline the key questions that need to be investigated, and acknowledge the elegant design of these intriguing macromolecules.
Funding: Supported by the National Natural Science Foundation of China (31101063, 31271386) and the National Basic Research Program of China (2010CB126604, 2011CB944100, 2011CB944101).
Abstract: All eukaryotic genomes have genes with introns of variable sizes. As far as spliceosomal introns are concerned, there are at least three basic parameters to stratify introns across diverse eukaryotic taxa: size, number, and sequence context. The number parameter is highly variable in lower eukaryotes, especially among protozoan and fungal species, ranging from less than 4% to 78% of the genes. Over greater evolutionary time scales, the number parameter undoubtedly increases, as observed in higher plants and higher vertebrates, reaching greater than 12.5 exons per gene on average among mammalian genomes. The size parameter is more complex, with multiple modes apparently at work. Aside from intronless genes, there are three other types of intron-containing genes: those with half-sized, minimal, and size-expandable introns. The half-sized introns have only been found in a limited number of genomes among protozoan and fungal lineages, and the other two types are prevalent in all animal and plant genomes. Among the size-expandable introns, plant introns are expansion-limited, in that large introns exceeding 1000 bp are fewer in number and transposon-free compared with the large introns of animals, which are filled with transposable elements and appear expansion-flexible, reaching several kilobasepairs (kbp) and even thousands of kbp in size. Most of the intron parameters can be studied as signatures of the specific splicing machineries of different eukaryotic lineages and are highly relevant to the regulation of gene expression and functionality. In particular, the transcription-splicing-export coupling of eukaryotic intron dispensing leads to a working hypothesis that all intron parameters have evolved to be efficient and function-related in processing and routing the spliced transcripts.
Funding: Supported by the National Basic Research Program of China (2007CB411600), the State Forestry Administration of China (WH0627), and the Fundamental Research Funds for the Central Universities of China.
Abstract: Artificial breeding is an important means to protect, recover and reintroduce endangered species. Knowledge of a population's genetic diversity at functional loci is important for the establishment of effective captive breeding programs. The major histocompatibility complex (MHC) genes are ideal candidate genetic markers to inform planned breeding, due to their high levels of polymorphism and importance in the main immune coding region of the vertebrate genome. In this study, we constructed BAC-based contigs and isolated six functional MHC class I genes from the giant panda (Ailuropoda melanoleuca), which we designated Aime-C, Aime-F, Aime-I, Aime-K, Aime-L and Aime-1906. Analyses of the tissue expression patterns and full-length cDNA sequences of these class I genes revealed that Aime-C, -F, -I and -L could be considered classical class I loci, due to their extensive expression patterns and normal exonic structures. In contrast, Aime-K and -1906 appeared to be nonclassical genes based on their tissue-specific expression patterns and the presence of an abnormal exon 7 in both genes. We established techniques for genotyping exons 2 and 3 of the classical loci using locus-specific single-strand conformation polymorphism (SSCP) and sequence analysis. In the Chengdu captive population, we identified one monomorphic locus (Aime-F) and three polymorphic loci with different numbers of alleles (4/4/4 exon 2 alleles at Aime-C/I/L and 6/5/5 exon 3 alleles at Aime-C/I/L). The distributions of the Aime-C, -I and -L alleles among members of different families were in good agreement with the known pedigree relationships, suggesting that the genotyping results are reliable. Therefore, the MHC-I genotyping techniques established in this study may provide a powerful tool for the future design of scientific breeding or release/reintroduction programs.
Abstract: One of the most exciting findings in RNA biology is the discovery of numerous circular RNAs (circRNAs) in mammalian genomes. Once considered low-abundance splicing byproducts, circRNAs are surprisingly abundant and can be generated by multiple pathways. The majority of circRNAs are generated by RNA backsplicing, in which an upstream 3' splice site (ss) is joined with a downstream 5' ss. Several groups have independently demonstrated that complementary pairing of intronic sequences is sufficient to promote the biogenesis of circRNAs via backsplicing. In addition, intronic circRNAs can be generated through partial degradation of lariat RNAs, which are splicing byproducts.
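The backsplicing step described in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `backsplice`, the half-open exon coordinates, the `flank` parameter, and the toy sequence are all assumptions made for illustration. It joins the 5' ss at the end of a downstream exon to the 3' ss at the start of an upstream exon and reports the resulting junction.

```python
def backsplice(pre_rna, exons, first, last, flank=3):
    """Circularize exons[first..last] of a linear precursor (illustrative only).

    pre_rna : precursor sequence as a string
    exons   : list of (start, end) half-open intervals, in 5'->3' order
    Returns (circ_seq, junction): circ_seq is the circle written linearly,
    starting at the upstream exon; junction is the sequence spanning the
    back-splice site (end of the downstream exon + start of the upstream exon).
    """
    if not 0 <= first <= last < len(exons):
        raise ValueError("first/last must index exons with first <= last")
    # Concatenate the exons; the introns between them are removed as in
    # ordinary splicing.
    circ_seq = "".join(pre_rna[s:e] for s, e in exons[first:last + 1])
    # The downstream 5' ss (end of the last exon) is joined to the
    # upstream 3' ss (start of the first exon), closing the circle.
    junction = circ_seq[-flank:] + circ_seq[:flank]
    return circ_seq, junction


# Toy precursor: lowercase = flanking/intronic, uppercase = exonic.
pre = "nnEXONAiiiEXONBnn"
exon_coords = [(2, 7), (10, 15)]
circ, junc = backsplice(pre, exon_coords, 0, 1)
print(circ, junc)
```

A junction string of this kind is absent from the linear transcript, which is why reads spanning it are commonly used as evidence for a circular molecule.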
Funding: Supported by the National Basic Research Foundation of China (2014CB964900 to Yi Chengqi) and the National Natural Science Foundation of China (31270838, 21472009 to Yi Chengqi).
Abstract: Epigenetic changes caused by DNA methylation and histone modifications play important roles in the regulation of various cellular processes and development. Recent discoveries of 5-methylcytosine (5mC) oxidation derivatives, including 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxycytosine (5caC), in mammalian genomes further expand our understanding of epigenetic regulation. Analysis of DNA modification patterns relies increasingly on sequencing-based profiling methods. A number of different approaches have been established to map DNA epigenomes at single-base resolution, as represented by bisulfite-based methods such as classical bisulfite sequencing (BS-seq), TET-assisted bisulfite sequencing (TAB-seq) and oxidative bisulfite sequencing (oxBS-seq), among others. These methods have been used to generate base-resolution maps of 5mC and its oxidation derivatives in genomic samples. The focus of this review is the chemical methodologies that have been developed to detect the cytosine derivatives in genomic DNA.
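As a rough illustration of how the bisulfite-based methods named in this abstract complement one another, the sketch below tabulates the idealized base call each method gives for each cytosine form and intersects them. The readout table reflects the commonly described chemistry of these methods and is not taken from the abstract; complete conversion is assumed, and the names `READOUT` and `candidates` are hypothetical.

```python
# Idealized read-out per method: "C" = still read as cytosine,
# "T" = read as thymine after conversion. Assumes 100% conversion
# efficiency, which real experiments do not reach.
READOUT = {
    "BS-seq":   {"C": "T", "5mC": "C", "5hmC": "C", "5fC": "T", "5caC": "T"},
    "oxBS-seq": {"C": "T", "5mC": "C", "5hmC": "T", "5fC": "T", "5caC": "T"},
    "TAB-seq":  {"C": "T", "5mC": "T", "5hmC": "C", "5fC": "T", "5caC": "T"},
}


def candidates(observations):
    """Return the cytosine forms consistent with the observed base calls.

    observations: dict mapping method name -> observed base ("C" or "T")
    at one genomic cytosine position.
    """
    forms = set(READOUT["BS-seq"])
    for method, base in observations.items():
        # Keep only forms that this method would read as the observed base.
        forms &= {f for f, b in READOUT[method].items() if b == base}
    return sorted(forms)


# BS-seq alone cannot separate 5mC from 5hmC; adding oxBS-seq can.
print(candidates({"BS-seq": "C"}))
print(candidates({"BS-seq": "C", "oxBS-seq": "T"}))
```

In this idealized picture, a position read as C by BS-seq but T by oxBS-seq is attributed to 5hmC, whereas TAB-seq reads 5hmC as C directly.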
Abstract: The advent of high-throughput technologies has revealed that mammalian genomes are pervasively transcribed, mostly into long noncoding RNAs (lncRNAs, at least 200 nt long). Thousands of lncRNAs from intergenic regions (large intergenic noncoding RNAs, lincRNAs) have been uncovered by massive deep sequencing of the repertoire of polyadenylated (poly(A)+) RNAs, together with multiple chromatin landscapes. These lncRNAs are messenger RNA (mRNA)-like, with linear signatures of 5' m7G caps and 3' poly(A) tails. Unexpectedly, mammalian transcriptomes are even more complex, with the expression of RNAs without polyadenylated tails (poly(A)- RNAs) [1], leading to the identification of new lncRNA formats, such as circular RNAs.