Funding: This work was supported equally by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB31000000), the National Natural Science Foundation of China (grant numbers 31590821 and 91731301 to J.L. and 32070669 to X.W.), and the National Key Research and Development Program of China (2017YFC0505203 to Z.X.), and also by the Fundamental Research Funds for the Central Universities (SCU2019D013 and 2020SCUNL207) and the National High-Level Talents Special Support Plan (10 Thousand People Plan).
Abstract: Evidence of whole-genome duplications (WGDs) and subsequent karyotype changes has been detected in most major lineages of living organisms on Earth. Convenient and accurate toolkits are needed to clarify the complex, multi-layered patterns of gene collinearity that result. To meet this need, we developed WGDI (Whole-Genome Duplication Integrated analysis), a Python-based command-line tool that facilitates comprehensive analysis of recursive polyploidization events and cross-species genome alignments. WGDI supports three main workflows (polyploid inference, hierarchical inference of genomic homology, and ancestral chromosome karyotyping) that can improve the detection of WGD and the characterization of WGD-related events based on high-quality chromosome-level genomes. Significantly, it can extract complete synteny blocks and facilitate reconstruction of detailed karyotype evolution. This toolkit is freely available at GitHub (https://github.com/SunPengChuan/wgdi). As an example of its application, WGDI convincingly clarified karyotype evolution in Aquilegia coerulea and Vitis vinifera following WGDs and rejected the hypothesis that Aquilegia contributed as a parental lineage to the allopolyploid origin of core dicots.
Funding: Supported by the Major Program of the National Natural Science Foundation of China (grant no. 31991210) to Q.S. and by the National Natural Science Foundation of China (grant no. 31701415) to W.G.
Abstract: Plant genome sequencing has dramatically increased, and some species even have multiple high-quality reference versions. Demands for clade-specific homology inference and analysis have increased in the pangenomic era. Here we present a novel method, GeneTribe (https://chenym1.github.io/genetribe/), for homology inference among genetically similar genomes that incorporates gene collinearity and shows better performance than traditional sequence-similarity-based methods in terms of accuracy and scalability. The Triticeae tribe is a typical allopolyploid-rich clade with complex species relationships that includes many important crops, such as wheat, barley, and rye. We built Triticeae-GeneTribe (http://wheat.cau.edu.cn/TGT/), a homology database, by integrating 12 Triticeae genomes and 3 outgroup model genomes and implemented versatile analysis and visualization functions. With macrocollinearity analysis, we were able to construct a refined model illustrating the structural rearrangements of the 4A-5A-7B chromosomes in wheat as two major translocation events. With collinearity analysis at both the macro- and microscale, we illustrated the complex evolutionary history of homologs of the wheat vernalization gene Vrn2, which evolved as a combined result of genome translocation, duplication, polyploidization, and gene loss events. Our work provides a useful practice for connecting emerging genome assemblies, with awareness of the extensive polyploidy in plants, and will help researchers efficiently exploit genome sequence resources.
Funding: Supported by the National Natural Science Foundation of China (31501345) and the Young Elite Scientist Sponsorship Program by CAST (China Association for Science and Technology).
Abstract: Brassinosteroids (BRs), which are essential phytohormones for plant growth and development, are important for cotton fiber development. Additionally, BES1 transcription factors are critical for BR signal transduction. However, cotton BES1 family genes have not been comprehensively characterized. In this study, we identified 11 BES1 genes in G. arboreum, 11 in G. raimondii, 16 in G. barbadense, and 22 in G. hirsutum. The BES1 sequences were significantly conserved in the Arabidopsis thaliana, rice, and upland cotton genomes. A total of 94 BES1 genes from 10 different plant species were divided into three clades according to the neighbor-joining and minimum-evolution methods. Moreover, the exon/intron patterns and motif distributions were highly conserved among the A. thaliana and cotton BES1 genes. The collinearity among the orthologs from the At and Dt subgenomes was estimated. Segmental duplications in the At and Dt subgenomes were primarily responsible for the expansion of the cotton BES1 gene family. Of the GhBES1 genes, GhBES1.4_At/Dt exhibited BL-induced expression and was predominantly expressed in fibers. Furthermore, Col-0/mGhBES1.4_At plants produced curled leaves with long and bent petioles. These transgenic plants also exhibited decreased hypocotyl sensitivity to brassinazole and constitutive BR-induced/repressed gene expression patterns. The constitutive BR responses of the plants overexpressing mGhBES1.4_At were similar to those of the bes1-D mutant.
Funding: This work was supported in part by the National Natural Science Foundation of China (No. 52077076) and in part by the State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources (No. LAPS2021-18).
Abstract: To overcome the shortcomings of model-driven state estimation methods, this paper proposes a data-driven robust state estimation (DDSE) method based on off-line learning and on-line matching. At the off-line learning stage, a linear regression equation is obtained by clustering historical data from supervisory control and data acquisition (SCADA), which addresses the over-learning problem of existing DDSE methods; then a novel robust state estimation method that can be transformed into quadratic programming (QP) models is proposed to obtain the mapping relationship between the measurements and the state variables (MRBMS). The proposed QP models handle collinearity in the historical data well. Furthermore, the off-line learning stage is greatly accelerated in three ways: reducing the number of historical categories, constructing a tree retrieval structure for known topologies, and using sensitivity analysis when solving the QP models. At the on-line matching stage, the current snapshot is quickly matched with the historical ones to obtain the corresponding MRBMS, from which the estimated values of the state variables follow. Simulations demonstrate that the proposed DDSE method has clear advantages in suppressing over-learning, dealing with collinearity, robustness, and computational efficiency.
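The two-stage idea in this abstract (learn a mapping per historical cluster off-line, then match the current snapshot on-line) can be illustrated with a deliberately simplified sketch. It assumes scalar measurements and states, data that are already clustered, and ordinary least squares in place of the paper's robust QP formulation; the function names and the sample data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def fit_line(zs, xs):
    """Ordinary least-squares fit x ~ a*z + b for one cluster (stand-in for the QP model)."""
    z_bar, x_bar = mean(zs), mean(xs)
    sxx = sum((z - z_bar) ** 2 for z in zs)
    a = sum((z - z_bar) * (x - x_bar) for z, x in zip(zs, xs)) / sxx
    return a, x_bar - a * z_bar


def learn_offline(clusters):
    """Off-line stage: store a centroid and a fitted mapping for each cluster."""
    return [{"centroid": mean(zs), "coef": fit_line(zs, xs)} for zs, xs in clusters]


def estimate_online(models, z_now):
    """On-line stage: match the snapshot to the nearest centroid, then apply its mapping."""
    best = min(models, key=lambda m: abs(m["centroid"] - z_now))
    a, b = best["coef"]
    return a * z_now + b


# Hypothetical historical data: (measurements, states) for two pre-clustered groups.
history = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    ([10.0, 11.0, 12.0], [5.0, 5.5, 6.0]),
]
models = learn_offline(history)
estimate = estimate_online(models, 2.5)
print(estimate)
```

Keeping the fitted mappings in a lookup structure is what makes the on-line stage cheap: matching reduces to a nearest-centroid search followed by one linear evaluation.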
Funding: Supported by the Major Research Plan of the National Natural Science Foundation of China (No. 31690093), the Creative Research Groups of China (31621005), and the Agricultural Science and Technology Innovation Program Cooperation and Innovation Mission (CAAS-XTCX2016).
Abstract: Background: INDETERMINATE DOMAIN (IDD) transcription factors form one of the largest and most conserved gene families in the plant kingdom and play important roles in various processes of plant growth and development, such as flower induction in flowering control. To date, systematic and functional analysis of IDD genes in cotton remains in its infancy. Results: In this study, we identified a total of 162 IDD genes from eight different plant species, including 65 IDD genes in Gossypium hirsutum. Phylogenetic analysis divided the IDD genes into seven distinct groups. The gene structures and conserved motifs of GhIDD genes showed highly conserved exon-intron and protein motif distribution patterns. Gene duplication analysis revealed that among 142 orthologous gene pairs, 54 pairs were derived from segmental duplication events and four pairs from tandem duplication events. Further, the Ka/Ks values of most orthologous/paralogous gene pairs were less than one, suggesting purifying selection pressure during evolution. Spatiotemporal expression analysis by qRT-PCR revealed that most of the investigated GhIDD genes showed higher transcript levels in ovules at seven days post anthesis and were upregulated under multiple abiotic stress treatments. Conclusions: Evolutionary analysis revealed that the IDD gene family was highly conserved in plants during the rapid phase of evolution. Whole-genome duplication, as well as segmental and tandem duplication, contributed significantly to the expansion of the IDD gene family in upland cotton. Some distinct genes evolved into a special subfamily, indicating a potential role in the evolution and development of allotetraploid Gossypium hirsutum. High transcript levels of GhIDD genes in ovules illustrated their potential roles in seed and fiber development. Further, the upregulation of GhIDD genes under various abiotic stress treatments suggests that they are important genetic regulators for improving stress resistance in cotton breeding.
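The Ka/Ks reasoning used in this abstract follows the conventional reading of the ratio: below one suggests purifying selection, about one suggests neutral evolution, and above one suggests positive selection. A minimal sketch of that rule is given here; the function name, the tolerance, and the example values are hypothetical and do not come from the study.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Label a gene pair by its Ka/Ks ratio using the conventional thresholds."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical Ka and Ks values for three gene pairs (illustrative only).
pairs = {"pair_A": (0.05, 0.40), "pair_B": (0.30, 0.30), "pair_C": (0.60, 0.25)}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in pairs.items()}
print(labels)
```

In practice the ratio is estimated with uncertainty, so a hard threshold at exactly one is a simplification; a statistical test on each pair would be the more careful choice.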