Funding: Supported by the National Key Research and Development Program of China (2017YFC1201200, 2017YFC0908404, 2016YFC0901603, 2016YFB0201700), the National High-tech R&D Program of China (863 Program) (2015AA020108), and the State Key Laboratory of Protein and Plant Gene Research.
Abstract: Gene set enrichment (GSE) analyses play an important role in the interpretation of large-scale transcriptome datasets. Because the many available GSE tools perform discrepantly, obtaining optimal results from any single tool is challenging, which motivates integrating multiple GSE tools into a single method. Several existing ensemble methods, however, produce different scores for sorting pathways, and it is difficult for users to choose a single ensemble score that yields optimal final results. Here, we develop a machine-learning-based ensemble method called Combined Gene set analysis incorporating Prioritization and Sensitivity (CGPS), which integrates the results provided by nine prominent GSE tools into a single ensemble score (R score) for sorting pathways. To the best of our knowledge, CGPS is the first GSE ensemble method built on a priori knowledge of pathways and phenotypes. In an evaluation of 120 simulated datasets and 45 real datasets, sorting pathways by the R score prioritizes relevant pathways better than 10 widely used individual methods and five types of ensemble scores from two ensemble methods. Additionally, CGPS is applied to expression data involving the drug panobinostat, an anticancer treatment for multiple myeloma. The results identify cell processes associated with cancer, such as the p53 signaling pathway (hsa04115); by contrast, two ensemble methods (EnrichmentBrowser and EGSEA) assign this pathway a rank beyond 20, which may cause users to miss it in their analyses. We show that this method, which is based on a priori knowledge, can capture valuable biological information from numerous types of gene set collections, such as KEGG pathways, GO terms, Reactome, and BioCarta. CGPS is publicly available as standalone source code at ftp://ftp.cbi.pku.edu.cn/pub/CGPS_download/cgps-1.0.0.tar.gz.
Funding: Supported by the National Natural Science Foundation of China, No. 39500129.
Abstract: AIM To clone the core gene cDNA of Chinese hepatitis C virus (HCV) into the eukaryotic expression vector cosmid pTM3 and to express HCV core antigen in HepG2 cells. METHODS The core gene cDNA of HCV was introduced into the eukaryotic expression vector cosmid pTM3. Using the vaccinia virus/bacteriophage T7 hybrid expression system, HepG2 cells were transfected with the recombinant plasmid pTM3-Q534 by lipofectin. RESULTS From the transfected bacteria Top10F', 2 pTM3-Q534 clones containing the recombinant plasmid were identified among 10 randomly selected ampicillin-resistant colonies. By reverse transcription PCR and indirect immunofluorescence, HCV RNA and core protein were identified in HepG2 cells transfected with the recombinant plasmid. CONCLUSION The construction of a recombinant plasmid and the expression of the core gene cDNA of HCV in HepG2 cells were successful.