Funding: National Natural Science Foundation of China (31825015 to X.H.; 31901596 to J.S.); Young Elite Scientists Sponsorship Program by CAST (2021QNRC001 to J.S.).
Abstract: Plant genomes are so diverse that a substantial proportion of genomic sequences are not shared among individuals. The variable DNA sequences, along with the conserved core sequences, compose the more sophisticated pan-genome, which represents the collection of all non-redundant DNA in a species. With rapid progress in genome sequencing technologies, pan-genome research in plants is now accelerating. Here we review recent advances in plant pan-genomics, including the major driving forces of the structural variations that constitute the variable sequences, methodological innovations for representing the pan-genome, and major successes in constructing plant pan-genomes. We also summarize recent efforts toward decoding the remaining dark matter in telomere-to-telomere or gapless plant genomes. These new genome resources, which have remarkable advantages over the numerous previously assembled less-than-perfect genomes, are expected to become new references for genetic studies and plant breeding.
Funding: Supported by the National Natural Science Foundation of China (32100500), the Natural Science Foundation of Hebei Province (C2021201048), and the Interdisciplinary Research Program of Natural Science of Hebei University.
Abstract: Pan-genomics can encompass most of the genetic diversity of a species or population and has proved to be a powerful tool for studying genomic evolution and the origin and domestication of species, and for providing information for plant improvement. Plant genomics has progressed greatly because of improvements in sequencing technologies and the rapid reduction of sequencing costs. Nevertheless, pan-genomics still presents many challenges, including computationally intensive assembly methods, high costs with large numbers of samples, ineffective integration of big data, and difficulty in applying it to downstream multi-omics analysis and breeding research. In this review, we summarize the definition and recent achievements of plant pan-genomics, computational technologies used for pan-genome construction, and the applications of pan-genomes in plant genomics and molecular breeding. We also discuss challenges and perspectives for future pan-genomics studies and provide a detailed pipeline for sample selection, genome assembly and annotation, structural variation identification, and construction and application of graph-based pan-genomes. The aim is to provide important guidance for plant pan-genome research and a better understanding of the genetic basis of genome evolution, crop domestication, and phenotypic diversity for future studies.
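The core/variable definition of a pan-genome used in both abstracts can be illustrated with a minimal Python sketch. It is not taken from either review: it simplifies the pan-genome to sets of gene-family identifiers per accession (the abstracts speak of DNA sequences in general), and the function name `partition_pan_genome`, the accession names, and the gene identifiers are all hypothetical.

```python
from typing import Dict, Set, Tuple


def partition_pan_genome(
    gene_sets: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene families into pan, core, and variable sets.

    Assumption: each accession is summarized as a set of gene-family IDs.
    pan      = families present in at least one accession
    core     = families present in every accession
    variable = families missing from at least one accession
    """
    if not gene_sets:
        return set(), set(), set()
    pan = set().union(*gene_sets.values())
    core = set.intersection(*gene_sets.values())
    variable = pan - core
    return pan, core, variable


# Hypothetical toy data: three accessions, five gene families.
accessions = {
    "acc_A": {"g1", "g2", "g3"},
    "acc_B": {"g1", "g2", "g4"},
    "acc_C": {"g1", "g3", "g5"},
}

pan, core, variable = partition_pan_genome(accessions)
print("pan:", sorted(pan))
print("core:", sorted(core))
print("variable:", sorted(variable))
```

In real projects the presence/absence calls would come from assembly, annotation, and structural-variation identification rather than from hand-written sets, and graph-based representations store sequence paths instead of gene lists.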