Funding: Supported by the institutional fund of the Department of Internal Medicine, University of Iowa, USA.
Abstract: Single-molecule, real-time sequencing developed by Pacific Biosciences (PacBio) offers longer read lengths than second-generation sequencing (SGS) technologies, making it well suited to unsolved problems in genome, transcriptome, and epigenetics research. Highly contiguous de novo assemblies built from PacBio reads can close gaps in current reference assemblies and characterize structural variation (SV) in personal genomes. With longer reads, we can sequence through extended repetitive regions and detect mutations, many of which are associated with diseases. Moreover, because PacBio can sequence full-length transcripts or fragments of substantial length, PacBio transcriptome sequencing is advantageous for identifying gene isoforms and supports reliable discovery of novel genes and novel isoforms of annotated genes. Additionally, PacBio's sequencing technique provides information that is useful for the direct detection of base modifications, such as methylation. Beyond using PacBio sequencing alone, many hybrid sequencing strategies have been developed that combine more accurate short reads with PacBio long reads. In general, hybrid sequencing strategies are more affordable and scalable than PacBio sequencing alone, especially for small laboratories. The advent of PacBio sequencing has made available much information that could not be obtained via SGS alone.
Abstract: Tartary buckwheat (Fagopyrum tataricum) is an important pseudocereal crop that is strongly adapted to growth in adverse environments. Its gluten-free grain contains complete proteins with a well-balanced composition of essential amino acids and is a rich source of beneficial phytochemicals that provide significant health benefits. Here, we report a high-quality, chromosome-scale Tartary buckwheat genome sequence of 489.3 Mb, assembled by combining whole-genome shotgun sequencing with both Illumina short reads and single-molecule real-time long reads, sequence tags of a large-insert fosmid library, Hi-C sequencing data, and BioNano genome maps. We annotated 33 366 high-confidence protein-coding genes based on expression evidence. Intragenomic comparison and comparison with the sugar beet genome revealed an independent whole-genome duplication that occurred in the buckwheat lineage after it diverged from the common ancestor and that is not shared with rosids or asterids. The reference genome facilitated the identification of many new genes predicted to be involved in rutin biosynthesis and regulation, aluminum stress resistance, and drought and cold stress responses. Our data suggest that Tartary buckwheat's ability to tolerate high levels of abiotic stress is attributable to the expansion of several gene families involved in signal transduction, gene regulation, and membrane transport. The availability of these genomic resources will facilitate the discovery of agronomically and nutritionally important genes and the genetic improvement of Tartary buckwheat.
Funding: Supported by the Wellcome Trust, United Kingdom.
Abstract: The revolution in genome sequencing is continuing after the success of second-generation sequencing (SGS) technology. Third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small-genome analysis or targeted screening to one that promises high-quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). The MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed of data production make it suitable for real-time applications, and the release of this long-read sequencer has therefore generated much excitement and interest in the genomics community. While de novo genome assemblies can be produced cheaply from SGS data, assembly continuity is often relatively poor because short reads have limited ability to resolve long repeats. Assembly quality can be greatly improved by using TGS long reads, since longer reads can extend through repetitive regions, despite their higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies of genome surveillance at locations where rapid and reliable sequencing is needed but resources are limited.
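The point about long repeats can be illustrated with a toy calculation. The sketch below is not taken from the reviewed work: the function name `spans_repeat`, the `min_flank` parameter, and all coordinates are hypothetical. It only checks whether a single read covers a repeat together with some unique flanking sequence on both sides, which is roughly the condition needed to place the repeat unambiguously.

```python
def spans_repeat(read_start, read_len, repeat_start, repeat_len, min_flank=50):
    """Return True if the read covers the whole repeat plus min_flank
    unique bases on each side (0-based coordinates; hypothetical helper)."""
    read_end = read_start + read_len          # exclusive end of the read
    repeat_end = repeat_start + repeat_len    # exclusive end of the repeat
    return (read_start <= repeat_start - min_flank
            and read_end >= repeat_end + min_flank)


# A 5 kb repeat starting at position 10,000 (illustrative numbers only).
short_read = spans_repeat(read_start=9_900, read_len=150,
                          repeat_start=10_000, repeat_len=5_000)
long_read = spans_repeat(read_start=9_000, read_len=10_000,
                         repeat_start=10_000, repeat_len=5_000)
print(short_read, long_read)  # prints: False True
```

A 150-base read can never satisfy the condition for a 5 kb repeat, whatever its start position, whereas a 10 kb read can. Real assemblers also have to cope with sequencing errors, which this sketch ignores.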
Funding: the Max Planck Society. J.Y. is supported by the National Key Research and Development Program of China (2016YFD0101003), the National Natural Science Foundation of China (31525017, 31730064), and the Fundamental Research Funds for the Central Universities.
Abstract: Current global agricultural production must feed over 7 billion people. However, productivity varies greatly across the globe and is under threat from both increased competition for land and climate change and its associated environmental deterioration. Moreover, the increase in human population size and dietary changes are putting an ever greater burden on agriculture. The majority of this burden is met by the cultivation of a very small number of species, largely in locations that differ from their origin of domestication. Recent technological advances have raised the possibility of de novo domestication of wild plants as a viable solution for designing ideal crops while maintaining food security and a more sustainable low-input agriculture. Here we discuss how the discovery of multiple key domestication genes, alongside the development of technologies for accurate manipulation of several target genes simultaneously, renders de novo domestication a route toward crops for the future.
Funding: Supported by the National High Technology Research and Development Program of China (2015AA020104), the China Human Proteome Project (2014DFB30010), the National Science Foundation of China (31471239, to Leming Shi), and the 111 Project (B13016).
Abstract: Bioinformatics methods for RNA-seq data analysis are evolving rapidly as sequencing technologies improve. However, many challenges remain in processing RNA-seq data efficiently to obtain accurate and comprehensive results. Here we review strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and mapping them to the transcriptome are two distinct methods for quantifying the expression of genes/transcripts. Besides the known genes annotated in current databases, many novel genes/transcripts (especially long noncoding RNAs) can still be identified on the reference genome using RNA-seq. Moreover, owing to the incompleteness of current reference genomes, some novel genes are missing from them. Genome-guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying novel genes/transcripts on or beyond the reference genome. In addition, integrating the genes of distinct databases when conducting transcriptomics and genetics studies can improve the results of the corresponding analyses.
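As a small illustration of what quantifying expression from mapped reads can involve, the sketch below computes transcripts per million (TPM) from per-transcript read counts and transcript lengths. The abstract does not say which normalization the review discusses, so the choice of TPM is an assumption, and the function name `tpm` and the example numbers are hypothetical.

```python
def tpm(counts, lengths):
    """Transcripts per million from read counts and transcript lengths (bp).

    Assumes every length is positive; illustrative helper, not a
    published pipeline.
    """
    if len(counts) != len(lengths):
        raise ValueError("counts and lengths must have the same length")
    # Reads per base for each transcript corrects for transcript length.
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Scale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


counts = [100, 200, 300]        # mapped reads per transcript (made up)
lengths = [1_000, 2_000, 1_000]  # transcript lengths in bp (made up)
values = tpm(counts, lengths)
print([round(v) for v in values])  # prints: [200000, 200000, 600000]
```

The first two transcripts get the same TPM although one has twice the reads, because it is also twice as long.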