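Abstract: Multi-task learning uses the similarity between different tasks to aid decision-making. Compared with single-task learning, it can draw on more information and thus compensate for the insufficient use of information in single-task learning. This paper takes the Chinese and English text data from the NTCIR-ECA dataset as experimental data and emotion cause analysis as the research task, and proposes MTDLM (multi-task deep learning model), a model that combines multi-task learning with deep learning to perform emotion cause analysis across languages. Experimental results show that, under imbalanced data, MTDLM achieves a best F-score of 39% for emotion cause identification in English, outperforming single-task learning (F-score of 0) and a traditional baseline model (LR, F-score of 33%), which verifies the effectiveness of the model.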
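For readers unfamiliar with the metric, the F-score quoted above is the harmonic mean of precision and recall. The sketch below is only an illustration: the counts are hypothetical and are not taken from the NTCIR-ECA experiments. It shows one way an F-score of 0 can arise on imbalanced data, namely when a classifier never predicts the rare positive class; the abstract does not say that this is what happened to the single-task model.

```python
def f_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: a model that labels every clause as "not a cause"
# finds none of the 50 true causes, so tp = 0 and the F-score is 0.
all_negative = f_score(tp=0, fp=0, fn=50)

# Hypothetical counts giving precision = recall = 0.39, hence F = 0.39.
balanced = f_score(tp=39, fp=61, fn=61)
```

Accuracy would look high for the all-negative model when positives are rare, which is why the F-score is the more informative figure in this setting.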
Funding: This work was supported by the Ministry of Science and Higher Education of the Russian Federation within the framework of state support for the creation and development of World-Class Research Centers "Digital Biodesign and Personalized Healthcare" (No. 75-15-2022-305).
Abstract: Analysis of the molecular mechanisms that lead to the development of various types of tumors is essential for biology and medicine, because it may help to find new therapeutic opportunities for cancer treatment and cure, including personalized treatment approaches. One of the pathways known to be important for the development of neoplastic diseases and pathological processes is the Hedgehog signaling pathway, which normally controls human embryonic development. The systematic accumulation of various types of biological data, including protein interactions, regulation of gene transcription, and the results of proteomics and metabolomics experiments, allows computational analysis of these big data to identify the key molecular mechanisms of certain diseases and pathologies as well as promising therapeutic targets. The aim of this study is to develop a computational approach for revealing associations between human proteins and genes that interact with the Hedgehog pathway components, and for identifying their roles in the development of various types of tumors. We automatically collect sets of abstract texts from the NCBI PubMed bibliographic database. To recognize the Hedgehog pathway proteins and genes and neoplastic diseases, we use a dictionary-based named entity recognition approach, while for all other proteins and genes a machine learning method is used. For association extraction, we develop a set of semantic rules. We complement the results of the text analysis with gene set enrichment analysis. The identified key pathways that may influence the Hedgehog pathway, and their roles in tumor development, are then verified using information from the literature.
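The dictionary-based recognition step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary entries, the function names `tag_entities` and `cooccurrences`, and the sample sentences are all assumptions. The sentence-level co-occurrence pairing at the end is a much simpler stand-in for the semantic rules the abstract refers to, whose details are not given.

```python
import re

# Hypothetical mini-dictionary: lowercase surface form -> (canonical name, type).
# The entries are illustrative only and are not taken from the study.
DICTIONARY = {
    "sonic hedgehog": ("SHH", "GENE"),
    "shh": ("SHH", "GENE"),
    "smoothened": ("SMO", "GENE"),
    "gli1": ("GLI1", "GENE"),
    "basal cell carcinoma": ("basal cell carcinoma", "DISEASE"),
    "medulloblastoma": ("medulloblastoma", "DISEASE"),
}

# Longest terms first, so multi-word names win over shorter overlapping ones.
_TERMS = sorted(DICTIONARY, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in _TERMS) + r")\b", re.IGNORECASE
)


def tag_entities(text):
    """Return (start, end, canonical name, type) for each dictionary hit."""
    hits = []
    for match in PATTERN.finditer(text):
        canonical, etype = DICTIONARY[match.group(1).lower()]
        hits.append((match.start(), match.end(), canonical, etype))
    return hits


def cooccurrences(text):
    """Pair each gene with each disease mentioned in the same sentence.

    Assumption: plain co-occurrence replaces the paper's semantic rules.
    """
    pairs = set()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = tag_entities(sentence)
        genes = {name for _, _, name, etype in entities if etype == "GENE"}
        diseases = {name for _, _, name, etype in entities if etype == "DISEASE"}
        pairs.update((g, d) for g in genes for d in diseases)
    return pairs


sample = (
    "Sonic hedgehog signaling drives medulloblastoma. "
    "GLI1 is overexpressed in basal cell carcinoma."
)
pairs = cooccurrences(sample)
```

A dictionary lookup suits a small, well-curated vocabulary such as the pathway components; open-ended gene and protein names are harder to enumerate, which is consistent with the abstract's use of a machine learning method for those.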
Funding: Supported by grants from Universidad de Guadalajara (PROSNI 2016, 2017-8) to REGC; partially by grants from Consejo Nacional de Ciencia y Tecnologia (CONACyT No. PN 2016-01-465 and INFR-280414); PRODEP (213544) to OGP; the CONACyT Fellowship grant (374823) to AHG.
Abstract: Melatonin is a pleiotropic molecule that, after short-term sleep deprivation, promotes the proliferation of neural stem cells in the adult hippocampus. However, this effect has not been observed in long-term sleep deprivation. The precise mechanism by which melatonin modulates neural stem cells has not been fully elucidated, but evidence indicates that epigenetic regulators may be involved in this process. In this study, we investigated the effect of melatonin treatment during a 96-hour sleep deprivation and analyzed the expression of epigenetic modulators predicted by computational text mining and keyword clustering. Our results showed that the administration of melatonin under sleep-deprived conditions increased MECP2 expression and reduced SIRT1 expression in the dentate gyrus. We observed that let-7b, mir-132, and mir-124 were highly expressed in the dentate gyrus after melatonin administration, but they were not modified by sleep deprivation. In addition, we found more Sox2^+/5-bromo-2'-deoxyuridine (BrdU)^+ cells in the subgranular zone of the sleep-deprived group treated with melatonin than in the untreated group. These findings may support the notion that melatonin modifies the expression of epigenetic mediators that, in turn, regulate the proliferation of neural progenitor cells in the adult dentate gyrus under long-term sleep-deprived conditions. All procedures performed in this study were approved by the Animal Ethics Committee of the University of Guadalajara, Mexico (approval No. CI-16610) on January 2, 2016.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 71473183).
Abstract: Purpose: Our study proposes a bootstrapping-based method to automatically extract data-usage statements from academic texts. Design/methodology/approach: The method for data-usage statement extraction starts with seed entities and iteratively learns patterns and data-usage statements from unlabeled text. In each iteration, new patterns are constructed and added to the pattern list based on their calculated score. Three seed-selection strategies are also proposed in this paper. Findings: The performance of the method is verified by means of experiments on real data collected from computer science journals. The results show that the method can achieve satisfactory performance regarding precision of extraction and extensibility of obtained patterns. Research limitations: While the triple representation of sentences is effective and efficient for extracting data-usage statements, it is unable to handle complex sentences. Additional features that can address complex sentences should thus be explored in the future. Practical implications: Data-usage statement extraction is beneficial for data-repository construction and facilitates research on data-usage tracking, dataset-based scholar search, and dataset evaluation. Originality/value: To the best of our knowledge, this paper is among the first to address the important task of automatically extracting data-usage statements from real data.
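The bootstrapping loop described in this abstract (seed entities, pattern learning, scoring, iteration) can be sketched roughly as below. Everything specific here is an assumption made for illustration: the abstract does not give the paper's triple representation, its scoring formula, or its seed-selection strategies. In this sketch a pattern is simply the two tokens before and the one token after a slot, and a pattern's score is the fraction of its slot fillers that are already known entities.

```python
def tokenize(sentence):
    # Naive whitespace tokenizer; commas and the final period are dropped.
    return sentence.replace(",", " ").rstrip(".").split()


def contexts(tokens, left=2, right=1):
    """Yield ((left context, right context), slot token) for each full window."""
    for i in range(left, len(tokens) - right):
        pattern = (tuple(tokens[i - left:i]), tuple(tokens[i + 1:i + 1 + right]))
        yield pattern, tokens[i]


def bootstrap(sentences, seeds, iterations=3, min_score=0.5):
    tokenized = [tokenize(s) for s in sentences]
    # Map every context pattern to the set of tokens seen in its slot.
    fillers = {}
    for tokens in tokenized:
        for pattern, token in contexts(tokens):
            fillers.setdefault(pattern, set()).add(token)

    entities, patterns = set(seeds), set()
    for _ in range(iterations):
        # Hypothetical score: share of a pattern's fillers that are known entities.
        accepted = {
            p for p, toks in fillers.items()
            if len(toks & entities) / len(toks) >= min_score
        }
        new_entities = set().union(*(fillers[p] for p in accepted)) - entities
        if accepted <= patterns and not new_entities:
            break  # nothing new was learned in this round
        patterns |= accepted
        entities |= new_entities

    # A sentence counts as a data-usage statement if any accepted pattern fires.
    statements = [
        s for s, tokens in zip(sentences, tokenized)
        if any(p in patterns for p, _ in contexts(tokens))
    ]
    return entities, patterns, statements


corpus = [
    "We run experiments on the MNIST dataset.",
    "Our model is trained on the CIFAR-10 dataset.",
    "We evaluate on the ImageNet dataset.",
    "We thank the anonymous reviewers.",
]
entities, patterns, statements = bootstrap(corpus, seeds={"MNIST", "CIFAR-10"})
```

With these toy sentences, the context "on the _ dataset" is learned from the two seeds and then picks up a third dataset name. The score threshold matters in practice: a low `min_score` lets generic contexts through and pulls in non-dataset words, which is the usual drift risk in bootstrapping.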
Funding: Supported by the National Natural Science Foundation of China (No. 31371341), the Tsinghua University Initiative Scientific Research Program (No. 20141081175), and the Open Research Fund of State Key Laboratory of Bioelectronics, Southeast University.
Abstract: Cellular senescence is an irreversible cell cycle arrest program triggered by various exogenous and endogenous stimuli such as telomere dysfunction and DNA damage. It has been widely accepted as an antitumor program and is also closely related to embryo development, tissue repair, organismal aging, and age-related degenerative diseases. In the past decades, numerous efforts have been made to uncover the gene regulatory mechanisms of cellular senescence. There is a strong demand to integrate these data from various resources into one open platform. To support researchers working on cellular senescence, we have developed the Human Cellular Senescence Gene Database (HCSGD) by integrating multiple online published data sources into a comprehensive senescence gene annotation platform (http://bioinfo.au.tsinghua.edu.cn/member/xwwang/HCSGD). Potential Human Cellular Senescence Genes (HCSGS) were collected by combining information from the published literature, gene expression profiling data, and protein-protein interaction networks. Additionally, genes are annotated with gene ontology annotation and microRNA/drug/compound target information. HCSGD provides a valuable resource for visualizing cellular senescence gene networks, browsing annotated functional information, and retrieving senescence-associated genes through a user-friendly web interface.