Abstract: Genetic variation ranging from single-nucleotide polymorphisms to large structural variants (SVs) can cause variation of gene content among individuals within the same species. There is an increasing appreciation that a single reference genome is insufficient to capture the full landscape of genetic diversity of a species. Pan-genome analysis offers a platform to evaluate the genetic diversity of a species via investigation of its entire genome repertoire. Although a recent wave of pan-genomic studies has shed new light on crop diversity and improvement using advanced sequencing technology, the potential applications of crop pan-genomics in crop improvement are yet to be fully exploited. In this review, we highlight the progress achieved in understanding crop pan-genomics, discuss biological activities that cause SVs, review important agronomical traits affected by SVs, and present our perspective on the application of pan-genomics in crop improvement.
Funding: National Natural Science Foundation of China (31825015 to X.H.; 31901596 to J.S.); Young Elite Scientists Sponsorship Program by CAST (2021QNRC001 to J.S.).
Abstract: Plant genomes are so highly diverse that a substantial proportion of genomic sequences are not shared among individuals. The variable DNA sequences, along with the conserved core sequences, compose the more sophisticated pan-genome that represents the collection of all non-redundant DNA in a species. With rapid progress in genome sequencing technologies, pan-genome research in plants is now accelerating. Here we review recent advances in plant pan-genomics, including major driving forces of structural variations that constitute the variable sequences, methodological innovations for representing the pan-genome, and major successes in constructing plant pan-genomes. We also summarize recent efforts toward decoding the remaining dark matter in telomere-to-telomere or gapless plant genomes. These new genome resources, which have remarkable advantages over numerous previously assembled less-than-perfect genomes, are expected to become new references for genetic studies and plant breeding.
Funding: Supported by the National Natural Science Foundation of China (31822052 and 31572381) and the Science & Technology Support Program of Sichuan (2016NYZ0042 and 2017NZDZX0002).
Abstract: Pigs were domesticated independently in the Near East and China, indicating that a single reference genome from one individual is unable to represent the full spectrum of divergent sequences in pigs worldwide. Therefore, 12 de novo pig assemblies from Eurasia were compared in this study to identify the sequences missing from the reference genome. As a result, 72.5 Mb of nonredundant sequences (~3% of the genome) were found to be absent from the reference genome (Sscrofa11.1) and were defined as pan-sequences. Of the pan-sequences, 9.0 Mb were dominant in Chinese pigs, in contrast with their low frequency in European pigs. One sequence dominant in Chinese pigs contained the complete genic region of the tazarotene-induced gene 3 (TIG3), which is involved in fatty acid metabolism. Using flanking sequences and Hi-C based methods, 27.7% of the sequences could be anchored to the reference genome. The supplementation of these sequences could contribute to the accurate interpretation of the 3D chromatin structure. A web-based pan-genome database was further provided to serve as a primary resource for exploration of genetic diversity and to promote pig breeding and biomedical research.
Funding: Supported by the Deutsche Forschungsgemeinschaft (MA6473/1-1, MA6473/2-1).
Abstract: Many of our major crop species are polyploids, containing more than one genome or set of chromosomes. Polyploid crops present unique challenges, including difficulties in genome assembly, in discriminating between multiple gene and sequence copies, and in genetic mapping, hindering use of genomic data for genetics and breeding. Polyploid genomes may also be more prone to containing structural variation, such as loss of gene copies or sequences (presence–absence variation) and the presence of genes or sequences in multiple copies (copy-number variation). Although the two main types of genomic structural variation commonly identified are presence–absence variation and copy-number variation, we propose that homeologous exchanges constitute a third major form of genomic structural variation in polyploids. Homeologous exchanges involve the replacement of one genomic segment by a similar copy from another genome or ancestrally duplicated region, and are known to be extremely common in polyploids. Detecting all kinds of genomic structural variation is challenging, but recent advances such as optical mapping and long-read sequencing offer potential strategies to help identify structural variants even in complex polyploid genomes. All three major types of genomic structural variation (presence–absence, copy-number, and homeologous exchange) are now known to influence phenotypes in crop plants, with examples of flowering time, frost tolerance, and adaptive and agronomic traits. In this review, we summarize the challenges of genome analysis in polyploid crops, describe the various types of genomic structural variation and the genomics technologies and data that can be used to detect them, and collate information produced to date related to the impact of genomic structural variation on crop phenotypes. We highlight the importance of genomic structural variation for the future genetic improvement of polyploid crops.
Funding: We acknowledge financial support from AAFC-Genome Quebec GQAAC-2019-2 to M.V.S.; Agriculture and Agri-Food Canada Genomics Research and Development Initiative grant number J-002367 to H.H.T. and K.M.G.; Compute Canada, Research Portals and Platforms (RPP) award to M.V.S.; Compute Canada, Resources for Research Groups (RRG) award to M.V.S.; DFG Germany's Excellence Strategy (EXC2048/1-Project 390686111) to B.U.; Dutch TKI top-sector project Novel genetic and genomic tools for polyploid crops (project numbers BO26.03-009-004 and BO-50-002-022) to P.M.B.; European Union's Horizon 2020 research and innovation programme under grant agreement no 862858 (ADAPT) to C.W.B.B.; Germany Ministry of Education and Research BMBF FKZ031A536C to B.U.; Germany Ministry of Education and Research BMBF FKZ031A536C to M.E.B.; GIZ on behalf of the Federal Ministry for Economic Cooperation and Development, Germany to D. Ellis and N.L.A.; National Science Foundation (IOS 2140176) to C.R.B.; National Science Foundation NRT-IMPACTS fellowship (1828149) to N.B.; NC Agricultural Research Service to G.C.Y.; NC State University to G.C.Y.; NSF IOS-1929982 to C.R.B.; NWO-domein Toegepaste en Technische Wetenschappen MAMY project ID 16889 to C.W.B.B. and N.L.; Potato Variety Management Institute to K.V. and V.S.; State of Minnesota, Minnesota Department of Agriculture to L.M.S.; the United States-Israel Binational Agricultural Research and Development Funds IS-5038-17C and IS-5317-20C to J.J.; Texas A&M University to M.I.V.; The Clen P. and Emma L. Atchley Potato Research Faculty Excellence Endowment to J.C.K.; University of Maine to E.H.T.; USDA AFRI NIFA Pre-doctoral Fellowship project 2019-07160 to N.R.K.; USDA Multistate Research Funds accession 1004958 to W.S.D.J.; USDA Hatch Act 2019-03162 to C.R.B.; USDA NIFA 2020-67034-31731 to G.H.; USDA-NIFA 2016-34141-25707 to L.M.S.; USDA-NIFA-SCRI 2019-51181-30021 to L.M.S.; Dutch TKI top-sector project Genetics Assisted Assembly of Complex Genomes (project number BO-68-001-033-WPR) LWV20.112 Application of sequence-based multi-allelic marker
Abstract: Cultivated potato is a clonally propagated autotetraploid species with a highly heterogeneous genome. Phased assemblies of six cultivars, including two chromosome-scale phased genome assemblies, revealed extensive allelic diversity, including altered coding and transcript sequences, preferential allele expression, and structural variation that collectively result in a highly complex transcriptome and predicted proteome, which are distributed across the homologous chromosomes. Wild species contribute to the extensive allelic diversity in tetraploid cultivars, demonstrating ancestral introgressions predating modern breeding efforts. As a clonally propagated autotetraploid that undergoes limited meiosis, dysfunctional and deleterious alleles are not purged in tetraploid potato. Nearly a quarter of the loci bore mutations predicted to have a high negative impact on protein function, complicating breeders' efforts to reduce genetic load. The StCDF1 locus controls maturity, and analysis of six tetraploid genomes revealed that 12 allelic variants of StCDF1 are correlated with maturity in a dosage-dependent manner. Knowledge of the complexity of the tetraploid potato genome, with its rampant structural variation and embedded deleterious and dysfunctional alleles, will be key not only to implementing precision breeding of tetraploid cultivars but also to the construction of homozygous, diploid potato germplasm containing favorable alleles to capitalize on heterosis in F1 hybrids.
Funding: Supported in part by the 2021 Research Program of Sanya Yazhou Bay Science and Technology City (SKJC-2021-02-001), the Leading Innovative and Entrepreneur Team Introduction Program of Zhejiang (2019R01002), and the Fundamental Research Funds for the Central Universities (226-2022-00100 and 2022QZJH43).
Abstract: Structural variations (SVs) have long been described as being involved in the origin, adaptation, and domestication of species. However, the underlying genetic and genomic mechanisms are poorly understood. Here, we report a high-quality genome assembly of Gossypium barbadense acc. Tanguis, a landrace that is closely related to the formation of extra-long-staple (ELS) cultivated cotton. An SV-based pan-genome (Pan-SV) was then constructed using a total of 182,593 non-redundant SVs, including 2,236 inversions, 97,398 insertions, and 82,959 deletions from 11 assembled genomes of allopolyploid cotton. The utility of this Pan-SV was then demonstrated through population structure analysis and genome-wide association studies (GWASs). Using segregating mapping populations produced by crossing ELS cotton and the landrace, along with an SV-based GWAS, certain SVs responsible for speciation, domestication, and improvement in tetraploid cottons were identified. Importantly, some of the SVs identified here as associated with yield and fiber quality improvement had not been identified in previous SNP-based GWAS. In particular, a 9-bp insertion or deletion was found to be associated with elimination of the interspecific reproductive isolation between Gossypium hirsutum and G. barbadense. Collectively, this study provides new insights into genome-wide, gene-scale SVs linked to important agronomic traits in a major crop species and highlights the importance of SVs during the speciation, domestication, and improvement of cultivated crop species.
Funding: Funding from several sources, including the Chongqing Scientific Research Institution Performance Incentive Project (grant number cstc2022jxjl80007); the Earmarked Fund for China Agriculture Research System (grant number CARS-42-51); the Chongqing Scientific Research Institution Performance Incentive Project (grant number 22527 J); the Key R&D Project in Agriculture and Animal Husbandry of Rongchang (grant number No. 22534C-22); Natural Science Foundation of Chongqing Project (grant number CSTB2022NSCQ-MSX0434); Natural Science Foundation of Sichuan Project (grant number 2022NSFSC0605); Natural Science Foundation of Sichuan Project (grant number 2021YFS0379); and the Chongqing Technology Innovation and Application Development Project (grant number No. cstc2021ycjh-bgzxm0248).
Abstract: Background: Domestic goose breeds are descended from either the Swan goose (Anser cygnoides) or the Greylag goose (Anser anser) and exhibit variation in body size, reproductive performance, egg production, feather color, and other phenotypic traits. Constructing a pan-genome facilitates a thorough identification of genetic variations, thereby deepening our comprehension of the molecular mechanisms underlying genetic diversity and phenotypic variability. Results: To facilitate population genomic and pan-genomic analyses in geese, we assembled whole-genome resequencing data from 659 geese and compiled a database of 155 RNA-seq samples. By constructing the pan-genome for geese, we generated non-reference contigs totaling 612 Mb, unveiling a collection of 2,813 novel genes and pinpointing 15,567 core genes, 1,324 softcore genes, 2,734 shell genes, and 878 cloud genes in goose genomes. Furthermore, we detected an 81.97 Mb genomic region showing signs of selection, encompassing the TGFBR2 gene, which is correlated with variations in body weight among geese. Genome-wide association studies utilizing single nucleotide polymorphisms (SNPs) and presence-absence variation revealed significant genomic associations with various goose meat quality, reproductive, and body composition traits. For instance, a gene encoding the SVEP1 protein was linked to carcass oblique length, and a distinct gene-CDS haplotype of the SVEP1 gene exhibited an association with carcass oblique length. Notably, the pan-genome analysis revealed enrichment of variable genes in the “hair follicle maturation” Gene Ontology term, potentially linked to the selection of feather-related traits in geese. A gene presence-absence variation analysis suggested a reduced frequency of genes associated with “regulation of heart contraction” in domesticated geese compared to their wild counterparts. Our study provided novel insights into gene expression features and functions by integrating gene expression patterns across multiple organs and tissues in geese and analyzing po
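The core, softcore, shell, and cloud categories reported above are conventionally assigned from how often each gene is present across the sampled individuals. The abstract does not state the cutoffs that were used, so the following Python sketch is only an illustration under assumed thresholds: `softcore_min` and `shell_min` are hypothetical parameters, and the gene names and toy presence-absence calls are invented for the example.

```python
from collections import Counter


def classify_gene(presence_fraction, softcore_min=0.95, shell_min=0.05):
    """Assign a pan-genome category from the fraction of individuals carrying a gene.

    The cutoffs are illustrative assumptions, not values taken from the study.
    """
    if presence_fraction >= 1.0:
        return "core"        # present in every individual
    if presence_fraction >= softcore_min:
        return "softcore"    # present in nearly all individuals
    if presence_fraction >= shell_min:
        return "shell"       # present in an intermediate share
    return "cloud"           # rare


def classify_pav(pav):
    """pav maps gene name -> list of 0/1 presence calls, one per individual."""
    return {gene: classify_gene(sum(calls) / len(calls)) for gene, calls in pav.items()}


# Hypothetical toy presence-absence matrix for 40 individuals.
toy_pav = {
    "geneA": [1] * 40,             # 40/40 present
    "geneB": [1] * 39 + [0],       # 39/40 present
    "geneC": [1] * 16 + [0] * 24,  # 16/40 present
    "geneD": [1] + [0] * 39,       # 1/40 present
}

categories = classify_pav(toy_pav)
category_counts = Counter(categories.values())
print(categories)
```

With real data, the presence calls would come from a gene presence-absence variation analysis across the resequenced individuals, and the cutoffs should be those defined in the study's methods.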
Funding: Supported by grants from the National Key Research and Development Program of China (2022YFF1003001), the National Natural Science Foundation of China (32072576), the National Modern Agriculture Industry Technology System (CARS-23-G42), the Jiangsu Provincial Key Research and Development Program (BE2021376), the Innovation Program of the Beijing Academy of Agricultural and Forestry Sciences (KJCX20230121), and the Collaborative Innovation Program for Leafy and Root Vegetables of the Beijing Vegetable Research Center, Beijing Academy of Agricultural and Forestry Sciences (XTCX202302).
Abstract: The domestication of Brassica oleracea has resulted in diverse morphological types with distinct patterns of organ development. Here we report a graph-based pan-genome of B. oleracea constructed from high-quality genome assemblies of different morphotypes. The pan-genome harbors over 200 structural variant hotspot regions enriched in auxin- and flowering-related genes. Population genomic analyses revealed that early domestication of B. oleracea focused on leaf or stem development. Gene flows resulting from agricultural practices and variety improvement were detected among different morphotypes. Selective-sweep and pan-genome analyses identified an auxin-responsive small auxin up-regulated RNA gene and a CLAVATA3/ESR-RELATED family gene as crucial players in leaf–stem differentiation during the early stage of B. oleracea domestication and the BoKAN1 gene as instrumental in shaping the leafy heads of cabbage and Brussels sprouts. Our pan-genome and functional analyses further revealed that variations in the BoFLC2 gene play key roles in the divergence of vernalization and flowering characteristics among different morphotypes, and variations in the first intron of BoFLC3 are involved in fine-tuning the flowering process in cauliflower. This study provides a comprehensive understanding of the pan-genome of B. oleracea and sheds light on the domestication and differential organ development of this globally important crop species.
Funding: Supported by the National Natural Science Foundation of China (32188102, 32372148), the Innovation Program of Chinese Academy of Agricultural Sciences, the Youth Innovation of Chinese Academy of Agricultural Sciences (Y20230C36), the Guangdong Basic and Applied Basic Research Foundation (2023B1515020053), and the Youth Program of Guangdong Basic and Applied Research (2021A1515111123).
Abstract: Rice (Oryza sativa) is a significant crop worldwide with a genome shaped by various evolutionary factors. Rice centromeres are crucial for chromosome segregation and contain some unreported genes. Because the centromere region is diverse and complex, a comprehensive understanding of rice centromere structure and function at the population level is needed. We constructed a high-quality centromere map based on the rice super pangenome consisting of a 251-accession panel comprising both cultivated and wild species of Asian and African rice. We showed that rice centromeres contain diverse CentO satellite repeats, which vary across chromosomes and subpopulations, reflecting their distinct evolutionary patterns. We also revealed that long terminal repeats (LTRs), especially young Gypsy-type LTRs, are abundant in the peripheral CentO-enriched regions and drive rice centromere expansion and evolution. Furthermore, high-quality genome assembly and a complete telomere-to-telomere (T2T) reference genome enable us to obtain more centromeric genome information, although mapping and cloning of centromere genes remain challenging. We investigated the association between structural variations and gene expression in the rice centromere. A centromere gene, OsMAB, which positively regulates rice tiller number, was further confirmed by expression quantitative trait loci, haplotype analysis, and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 methods. By revealing new insights into the evolutionary patterns and biological roles of rice centromeres, our findings will facilitate future research on centromere biology and crop improvement.