Background: Traditional Chinese medicine (TCM) treats diseases in a holistic manner, and TCM formulae are multi-component, multi-target agents at the molecular level. There are thus many parallels between the key ideas of TCM pharmacology and network pharmacology. In recent years, TCM network pharmacology has developed as an interdisciplinary field combining TCM science and network pharmacology, studying the mechanisms of TCM at the molecular level and in the context of biological networks. It provides a new research paradigm that uses modern biomedical science to interpret the mechanisms of TCM, and it promises to accelerate the modernization and internationalization of TCM. Results: In this paper we introduce state-of-the-art free data sources, web servers and software tools that can be used in TCM network pharmacology, including databases of TCM, drug targets and diseases, web servers for the prediction of drug targets, and tools for network and functional analysis. Conclusions: This review should help experimental pharmacologists make better use of the existing data and methods in their studies of TCM.
This article develops a polytopic linear parameter varying (LPV) model and presents a non-fragile H2 gain-scheduled control scheme for a flexible air-breathing hypersonic vehicle (FAHV). First, the polytopic LPV model of the FAHV is obtained by using Jacobian linearization and the tensor-product (TP) model transformation approach; simulation verification illustrates that the polytopic LPV model captures the local nonlinearities of the original nonlinear system. Second, based on the developed polytopic LPV model, a non-fragile gain-scheduled control method is proposed in order to reduce the fragility encountered in controller implementation; a convex optimization problem with linear matrix inequality (LMI) constraints is formulated for designing a velocity and altitude tracking controller, which guarantees an H2 control performance index. Finally, numerical simulations demonstrate the effectiveness of the proposed approach.
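The polytopic structure above means the plant matrices are convex combinations of a fixed set of vertex systems. A minimal sketch of evaluating one step of such a model (the toy vertex matrices and weights here are illustrative assumptions, not the FAHV model):

```python
import numpy as np

def lpv_step(x, u, vertices, alpha):
    """One discrete-time step of a polytopic LPV model:
    x' = A(alpha) x + B(alpha) u, where (A, B) is the convex combination
    of the vertex systems weighted by alpha (alpha_i >= 0, sum = 1)."""
    alpha = np.asarray(alpha, dtype=float)
    assert np.all(alpha >= 0) and np.isclose(alpha.sum(), 1.0)
    A = sum(a * Ai for a, (Ai, Bi) in zip(alpha, vertices))
    B = sum(a * Bi for a, (Ai, Bi) in zip(alpha, vertices))
    return A @ x + B @ u

# Two vertex systems of a toy 2-state model (illustrative values only).
vertices = [
    (np.array([[0.9, 0.1], [0.0, 0.8]]), np.array([[0.0], [1.0]])),
    (np.array([[0.7, 0.2], [0.1, 0.9]]), np.array([[0.5], [1.0]])),
]
x = np.array([1.0, 0.0])
u = np.array([1.0])
# At a vertex (alpha = [1, 0]) the LPV model reduces to that vertex system.
x_next = lpv_step(x, u, vertices, [1.0, 0.0])
```

Gain scheduling then interpolates vertex controllers with the same weights, which is what makes the LMI synthesis over the vertices sufficient for the whole parameter box.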
Background: In the human genome, distal enhancers are involved in regulating target genes through proximal promoters by forming enhancer-promoter interactions. Although recently developed high-throughput experimental approaches have allowed us to recognize potential enhancer-promoter interactions genome-wide, it is still largely unclear to what extent the sequence-level information encoded in our genome helps guide such interactions. Methods: Here we report a new computational method (named "SPEID") using deep learning models to predict enhancer-promoter interactions based on sequence-based features only, when the locations of putative enhancers and promoters in a particular cell type are given. Results: Our results across six different cell types demonstrate that SPEID is effective in predicting enhancer-promoter interactions as compared to state-of-the-art methods that only use information from a single cell type. As a proof of principle, we also applied SPEID to identify somatic non-coding mutations in melanoma samples that may have reduced enhancer-promoter interactions in tumor genomes. Conclusions: This work demonstrates that deep learning models can help reveal that sequence-based features alone are sufficient to reliably predict enhancer-promoter interactions genome-wide.
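Sequence-only deep models of this kind consume DNA as one-hot matrices. A minimal illustrative encoder (the function name and handling of ambiguous bases are assumptions; SPEID's actual preprocessing may differ):

```python
import numpy as np

def one_hot_dna(seq):
    """Encode a DNA string as a (length, 4) one-hot matrix over (A, C, G, T)."""
    idx = {"A": 0, "C": 1, "G": 2, "T": 3}
    m = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        if base in idx:              # leave ambiguous bases (e.g., N) all-zero
            m[i, idx[base]] = 1.0
    return m

x = one_hot_dna("ACGTN")
```

An enhancer window and a promoter window encoded this way form the paired input that a convolutional network can then score for interaction.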
Automatic target recognition (ATR) is an important function of modern radar. The high resolution range profile (HRRP) of a target contains target structure signatures, such as target size and scatterer distribution, which makes it a promising signature for ATR. Statistical modeling of target HRRPs, including model selection and parameter estimation, is the key stage for HRRP statistical recognition. Statistical recognition algorithms generally assume that the test samples follow the same distribution model as the training data. Since the signal-to-noise ratio (SNR) of the received HRRP is a function of target distance, this assumption may not be met in practice. In this paper, we present a robust method for HRRP statistical recognition when the SNR of the test HRRP is lower than that of the training samples. The noise is assumed to be independent and Gaussian distributed, while the HRRP is modeled by a probabilistic principal component analysis (PPCA) model. Simulated experiments based on measured data show the effectiveness of the proposed method.
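The PPCA model referred to above has a closed-form maximum-likelihood fit (Tipping and Bishop). The sketch below shows that fit on toy data; the toy latent signature and the remark about absorbing extra test noise into the isotropic variance are illustrative, not the paper's exact scheme:

```python
import numpy as np

def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA fit (Tipping & Bishop).

    For data X (n_samples x n_features), returns the loading matrix W and
    the isotropic noise variance sigma2 so the model covariance is
    W W^T + sigma2 * I.
    """
    S = np.cov(X - X.mean(axis=0), rowvar=False)
    evals, evecs = np.linalg.eigh(S)                 # ascending order
    evals, evecs = evals[::-1], evecs[:, ::-1]       # make descending
    sigma2 = evals[q:].mean()                        # ML noise variance
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return W, sigma2

# One-dimensional latent signature observed in 3 range cells (toy data).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
w_true = np.array([[2.0, 1.0, 0.5]])
X = z @ w_true + 0.1 * rng.normal(size=(500, 3))
W, sigma2 = fit_ppca(X, q=1)
# For a lower-SNR test HRRP, extra noise power can in principle be folded
# into sigma2 -- the idea behind robustness to an SNR mismatch.
C = W @ W.T + sigma2 * np.eye(3)                     # model covariance
```

A useful sanity check is that the PPCA model covariance preserves the trace of the sample covariance exactly.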
In order to achieve long-term covert precise navigation for an underwater vehicle, the shortcomings of the various underwater navigation methods in use are analyzed. Given the low navigation precision of underwater map-matching aided inertial navigation based on single geophysical information, a model of an underwater map-matching aided inertial navigation system based on multi-geophysical information (gravity, topography and geomagnetism) is put forward, and the key technologies of map-matching based on multi-geophysical information are analyzed. The iterative closest contour point (ICCP) map-matching algorithm and data fusion based on Dempster-Shafer (D-S) evidence theory are applied to navigation simulation. Simulation results show that the accumulation of errors with increasing time and distance is restrained, and that fusion of multiple map-matchings is superior to any single map-matching; the approach can effectively determine the best match of the underwater vehicle position and improve the accuracy of underwater vehicle navigation.
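The D-S fusion step can be illustrated with Dempster's rule of combination. The sketch below fuses two hypothetical mass functions from two map-matching sources over two candidate positions (the mass values are invented for illustration):

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Masses are dicts mapping frozenset hypotheses to belief mass;
    conflicting mass is discarded and the rest renormalized."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    k = 1.0 - conflict
    return {h: v / k for h, v in combined.items()}

# Two map-matching sources (hypothetical masses) over positions {p1, p2}.
P1, P2 = frozenset({"p1"}), frozenset({"p2"})
EITHER = frozenset({"p1", "p2"})
gravity = {P1: 0.6, EITHER: 0.4}
terrain = {P1: 0.7, EITHER: 0.3}
fused = dempster_combine(gravity, terrain)
```

Because both sources lean toward p1, the fused belief in p1 exceeds either source's belief alone, which is the qualitative effect the abstract reports for multi-map fusion.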
All eight possible extended rough set models in incomplete information systems are proposed. By analyzing existing extended models and technical methods of rough set theory, the strategy of model extension is found to be suitable for processing incomplete information systems, instead of filling in possible values for missing attributes. After analyzing the definitions of existing extended models, a new general extended model is proposed. The new model is a generalization of indiscernibility relations, tolerance relations and non-symmetric similarity relations. Finally, suggestions for further study of rough set theory in incomplete information systems are put forward.
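One of the relations generalized above, the tolerance relation, is easy to state concretely: two objects are indiscernible if on every attribute their values agree or at least one value is missing. A minimal sketch (the toy universe is illustrative, with `None` playing the role of the missing value "*"):

```python
def tolerant(x, y):
    """Tolerance relation for incomplete information systems: objects are
    indiscernible if, on every attribute, the values agree or at least one
    is missing (None stands for the missing value '*')."""
    return all(a is None or b is None or a == b for a, b in zip(x, y))

def tolerance_class(obj, universe):
    """All objects of the universe tolerant with `obj`."""
    return [y for y in universe if tolerant(obj, y)]

# Toy incomplete information table: (attr1, attr2, attr3) per object.
U = [
    ("high", None,  "yes"),
    ("high", "low", "yes"),
    ("low",  "low", None),
]
cls = tolerance_class(U[0], U)
```

Rough approximations are then built from these tolerance classes instead of the classical equivalence classes.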
Cancer stem cell (CSC) theory suggests a cell-lineage structure in tumor cells in which CSCs are capable of giving rise to the other, non-stem cancer cells (NSCCs) but not vice versa. However, an alternative scenario of bidirectional interconversions between CSCs and NSCCs was proposed very recently. Here we present a general population model of cancer cells by integrating conventional cell divisions with direct conversions between different cell states; that is, not only can CSCs differentiate into NSCCs by asymmetric cell division, NSCCs can also dedifferentiate into CSCs by cell state conversion. Our theoretical model is validated by applying it to recent experimental data. It is also found that the transient increase in the CSC proportion initiated from the purified NSCC subpopulation cannot be well predicted by the conventional CSC model, in which the conversion from NSCCs to CSCs is forbidden, implying that cell state conversion is required especially for the transient dynamics. The theoretical analysis also gives the condition under which our general model can be equivalently reduced to a simple Markov chain with only cell state transitions, keeping the same cell-proportion dynamics.
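The reduced Markov-chain picture mentioned at the end can be sketched directly: with a nonzero NSCC-to-CSC rate, a culture started from purified NSCCs recovers a stationary CSC proportion instead of staying at zero. The transition probabilities below are illustrative, not fitted values:

```python
import numpy as np

# Reduced two-state Markov chain over cell states {CSC, NSCC};
# rates are illustrative stand-ins, not values from the paper.
P = np.array([
    [0.98, 0.02],   # CSC  -> CSC / NSCC  (differentiation)
    [0.01, 0.99],   # NSCC -> CSC / NSCC  (dedifferentiation)
])

def propagate(p0, steps):
    """Evolve a cell-state proportion vector p0 for `steps` generations."""
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = p @ P
    return p

# Starting from purified NSCCs, the CSC proportion recovers toward the
# stationary value 0.01 / (0.01 + 0.02) = 1/3 rather than staying at zero.
p = propagate([0.0, 1.0], 500)
```

Setting the NSCC-to-CSC entry to zero reproduces the conventional hierarchical model, in which the purified-NSCC population can never regenerate CSCs.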
Background: Many existing bioinformatics predictors are based on machine learning technology. When applying these predictors in practical studies, their predictive performance should be well understood. Different performance measures are applied in various studies, as well as different evaluation methods. Even for the same performance measure, different terms, nomenclatures or notations may appear in different contexts. Results: We carried out a review of the most commonly used performance measures and evaluation methods for bioinformatics predictors. Conclusions: It is important in bioinformatics to correctly understand and interpret predictive performance, as this is the key to rigorously comparing the performance of different predictors and to choosing the right predictor.
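As a concrete reference point for the measures such a review covers, the standard binary-classification metrics can all be derived from one confusion matrix (the counts below are invented for illustration):

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Common performance measures for a binary predictor,
    computed from the confusion-matrix counts."""
    sens = tp / (tp + fn)                      # sensitivity / recall / TPR
    spec = tn / (tn + fp)                      # specificity / TNR
    prec = tp / (tp + fp)                      # precision / PPV
    acc = (tp + tn) / (tp + fp + tn + fn)      # accuracy
    f1 = 2 * prec * sens / (prec + sens)       # F1 score
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {"sensitivity": sens, "specificity": spec,
            "precision": prec, "accuracy": acc, "F1": f1, "MCC": mcc}

m = binary_metrics(tp=80, fp=20, tn=90, fn=10)
```

The same quantities circulate under different names (recall vs. sensitivity, PPV vs. precision), which is exactly the nomenclature confusion the review addresses.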
Cis-acting regulatory elements, e.g., promoters and ribosome binding sites (RBSs) with various desired properties, are building blocks widely used in synthetic biology for fine-tuning gene expression. In the last decade, acquisition of controllable regulatory elements from random libraries has been established and applied to control protein expression and metabolic flux in different chassis cells. However, more rational strategies are still urgently needed to improve efficiency and to reduce laborious screening and multifaceted characterization. Building precise computational models that can predict the activity of regulatory elements and quantitatively design elements with desired strength has demonstrated tremendous potential. Here, recent progress on the construction of cis-acting regulatory element libraries and on quantitative predictive models for the design of such elements is reviewed and discussed in detail.
Non-smooth or even abrupt state changes exist during many biological processes, e.g., cell differentiation, proliferation, or even disease deterioration. Such dynamics generally signal the emergence of critical transition phenomena, which result in drastic changes of system states or eventually qualitative changes of phenotypes. Hence, it is of great importance to detect such transitions and further reveal their molecular mechanisms at the network level. Here, we review recent advances in dynamical network biomarkers (DNBs), as well as the related theoretical foundations, which can identify not only early signals of critical transitions but also their leading networks, which drive the whole system to initiate such transitions. In order to demonstrate the effectiveness of this approach, examples of complex diseases are also provided to detect the pre-disease stage, for which traditional methods or biomarkers have failed.
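One commonly cited form of the DNB criterion combines rising variance, rising intra-group correlation, and falling correlation with the rest of the network into a single composite index. The sketch below implements that form on synthetic data; both the exact formula and the toy data are stated here as assumptions about one common variant, not as the precise criterion of the reviewed papers:

```python
import numpy as np

def dnb_index(X_group, X_other):
    """Composite DNB index I = SD_d * PCC_d / PCC_o (one common form):
    average standard deviation of the candidate group, times average
    |correlation| inside the group, divided by average |correlation|
    between the group and the remaining molecules."""
    g = X_group.shape[0]
    sd = X_group.std(axis=1).mean()
    R = np.abs(np.corrcoef(np.vstack([X_group, X_other])))
    iu = np.triu_indices(g, k=1)
    pcc_d = R[:g, :g][iu].mean()      # within-group |correlation|
    pcc_o = R[:g, g:].mean()          # group-to-rest |correlation|
    return sd * pcc_d / pcc_o

# Synthetic pre-transition signature: a strongly co-fluctuating, high-
# variance group of 3 molecules against 3 unrelated background molecules.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
group = np.vstack([3.0 * z + 0.1 * rng.normal(size=200) for _ in range(3)])
other = rng.normal(size=(3, 200))
idx = dnb_index(group, other)
```

A candidate group whose index rises sharply over time would be flagged as the leading network of an impending transition.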
Background: Marker detection is an important task in complex disease studies. Here we provide an association rule mining (ARM) based approach for identifying integrated markers through mutual information (MI) based statistically significant feature extraction, and apply it to acute myeloid leukemia (AML) and prostate carcinoma (PC) gene expression and methylation profiles. Methods: We first collect the genes having both expression and methylation values in AML as well as in PC. Next, we run the Jarque-Bera normality test on the expression/methylation data to divide the whole dataset into two parts: one that follows a normal distribution and one that does not. Thus we have four parts of the dataset: normally distributed expression data, normally distributed methylation data, non-normally distributed expression data, and non-normally distributed methylation data. A feature-extraction technique, "mRMR", is then utilized on each part, resulting in a list of top-ranked genes. Next, we apply the Welch t-test (parametric) and the Shrink t-test (non-parametric) on the expression/methylation data for the top selected normally distributed genes and non-normally distributed genes, respectively. We then use a recent weighted ARM method, "RANWAR", to combine all/specific resultant genes to generate top oncogenic rules along with the respective integrated markers. Finally, we perform a literature search as well as KEGG pathway and Gene Ontology (GO) analyses using the Enrichr database for in silico validation of the prioritized oncogenes as markers, labeling each marker as existing or novel.
Results: The novel markers of AML are {ABCB11↑ ∪ KRT17↓} (i.e., ABCB11 as up-regulated, and KRT17 as down-regulated), and {AP1S1− ∪ KRT17↓ ∪ NEIL2− ∪ DYDC1↓} (i.e., AP1S1 and NEIL2 both as hypo-methylated, and KRT17 and DYDC1 both as down-regulated). The novel marker of PC is {UBIAD1↑ ∪ APBA2 ∪ C4orf31 …} (i.e., UBIAD1 as up-regulated and hypo-…
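The normality-based routing in the Methods above can be sketched compactly: compute the Jarque-Bera statistic per gene and send each profile to the parametric or non-parametric branch. The implementation and toy profiles below are illustrative (the downstream Welch/Shrink t-test and mRMR steps are omitted):

```python
import numpy as np

def jarque_bera_stat(x):
    """Jarque-Bera normality statistic: n/6 * (S^2 + (K - 3)^2 / 4),
    where S is the sample skewness and K the sample kurtosis."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    s2 = (d ** 2).mean()
    S = (d ** 3).mean() / s2 ** 1.5
    K = (d ** 4).mean() / s2 ** 2
    return n / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)

def split_by_normality(profiles, crit=5.99):
    """Route each gene to the parametric or non-parametric branch
    (5.99 is roughly the chi-square(2) critical value at alpha = 0.05)."""
    normal, non_normal = {}, {}
    for gene, values in profiles.items():
        branch = normal if jarque_bera_stat(values) < crit else non_normal
        branch[gene] = values
    return normal, non_normal

profiles = {
    "GENE_A": np.linspace(-1.0, 1.0, 50),        # symmetric, light-tailed
    "GENE_B": np.array([0.0] * 49 + [100.0]),    # extreme outlier
}
normal, non_normal = split_by_normality(profiles)
```

Each branch then receives the test suited to its distributional assumption, which is the point of the four-way data split described in the abstract.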
As a strain sensing element in structural health monitoring, the fibre-optic Bragg grating (FBG) has been widely studied and applied. The accuracy of the FBG sensor is highly dependent on the physical and mechanical properties governing strain transfer among the layers of bare optical fibre, protective coating, adhesive layer and host material. In this paper, firstly, the general expression of the multilayer interface strain-transfer mechanism is derived. Secondly, based on the defined average strain, the error-modified equation of the FBG sensor is obtained. Finally, for the embedded tube-packaged FBG and the fibre reinforced polymer-optical fibre Bragg grating (FRP-OFBG) strain sensors developed at the Harbin Institute of Technology (HIT), the corresponding strain-transfer laws are studied and the corresponding error-modification coefficients are given, which are validated by experiments. The research results provide a theoretical basis for the development and application of embedded FBG sensors.
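The "average strain" underlying the error-modified equation is the strain integrated over the grating length divided by that length; the ratio of this average to the host strain plays the role of an error-modification coefficient. The sketch below computes such a coefficient for a hypothetical transfer profile (the exponential profile is an invented stand-in, not the paper's derived expression):

```python
import numpy as np

def average_strain(x, strain):
    """(1/L) * integral of strain over the grating length,
    approximated with the trapezoidal rule."""
    L = x[-1] - x[0]
    dx = np.diff(x)
    return float(np.sum(0.5 * (strain[:-1] + strain[1:]) * dx) / L)

# Hypothetical interface strain-transfer profile (NOT the paper's derived
# expression): the transferred strain dips near the two ends of the grating.
x = np.linspace(0.0, 10.0, 201)                   # position along grating (mm)
host_strain = 1000e-6                             # host-material strain
transfer = 1.0 - np.exp(-x) - np.exp(x - 10.0)    # assumed transfer ratio
fbg_strain = host_strain * transfer
k = average_strain(x, fbg_strain) / host_strain   # error-modification coefficient
```

In use, dividing the strain reported by the FBG by k recovers an estimate of the true host strain, which is how such coefficients correct the sensor reading.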
Background: The increase in global population, climate change and the stagnancy of crop yield on a unit-land-area basis in recent decades urgently call for a new approach to support contemporary crop improvement. ePlant is a mathematical model of plant growth and development with a high level of mechanistic detail to meet this challenge. Results: ePlant integrates modules developed for processes occurring at drastically different temporal (10^-8 to 10^6 seconds) and spatial (10^-10 to 10 meters) scales, incorporating diverse physical, biophysical and biochemical processes including gene regulation, metabolic reactions, substrate transport and diffusion, energy absorption, transfer and conversion, organ morphogenesis, and plant-environment interaction. Individual modules are developed using a divide-and-conquer approach; modules at different temporal and spatial scales are integrated through transfer variables. We further propose a supervised learning procedure based on information geometry to combine model and data for both knowledge discovery and model extension or advances. We finally discuss the recent formation of a global consortium, which includes experts in plant biology, computer science, statistics, agronomy, phenomics, etc., aiming to expedite the development and application of ePlant or its equivalents by promoting a new model-development paradigm in which models are developed as a community effort instead of being driven mainly by the efforts of individual labs. Conclusions: ePlant, as a major research tool to support quantitative and predictive plant science research, will play a crucial role in future model-guided crop engineering, breeding and agronomy.
Experimental evidence and theoretical analyses have amply suggested that in cancer genesis and progression genetic information is very important, but it is not the whole story. Nevertheless, "cancer as a disease of the genome" is still the dominant doctrine. Against this background, and based on the fundamental properties of biological systems, a new endogenous molecular-cellular network theory for cancer was recently proposed by us; similar proposals were also made by others. The new theory attempts to incorporate both genetic and environmental effects into one single framework, with the possibility of giving a quantitative and dynamical description. It is asserted that the complex regulatory machinery behind biological processes may be modeled by a nonlinear stochastic dynamical system similar to a noise-perturbed Morse-Smale system, from which both qualitative and quantitative descriptions may be obtained. The dynamical variables are specified by a set of endogenous molecular-cellular agents, and the structure of the dynamical system by the interactions among those biological agents. Here we review this theory from a pedagogical angle that emphasizes the role of modularization, hierarchy and autonomous regulation. We discuss how the core set of assumptions is exemplified in detail in one of the simplest, most important and best studied model organisms, phage lambda. With this concrete and quantitative example in hand, we show that application of the hypothesized theory to human cancer, such as hepatocellular carcinoma (HCC), is plausible, and that it may provide a set of new insights into understanding cancer genesis and progression, and into strategies for cancer prevention, cure, and care.
Non-negative matrix factorization (NMF) is a recently popularized technique for learning parts-based, linear representations of non-negative data. Traditional NMF is optimized under a Gaussian-noise or Poisson-noise assumption, and is hence not suitable if the data are grossly corrupted. To improve the robustness of NMF, a novel algorithm named robust non-negative matrix factorization (RNMF) is proposed in this paper. We assume that some entries of the data matrix may be arbitrarily corrupted, but that the corruption is sparse. RNMF decomposes the non-negative data matrix as the sum of one sparse error matrix and the product of two non-negative matrices. An efficient iterative approach is developed to solve the optimization problem of RNMF. We present experimental results on two face databases to verify the effectiveness of the proposed method.
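For context, the Gaussian-noise baseline that RNMF extends is classical NMF with Lee-Seung multiplicative updates, sketched below; the robust variant with the sparse error matrix is not reproduced here:

```python
import numpy as np

def nmf(V, r, iters=200, seed=0, eps=1e-9):
    """Standard NMF via Lee-Seung multiplicative updates (Frobenius loss).

    This is the Gaussian-noise baseline that RNMF extends with a sparse
    error matrix; the robust updates themselves are not reproduced here.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H, then W;
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # both keep entries >= 0
    return W, H

rng = np.random.default_rng(1)
V = rng.random((20, 15))        # non-negative data matrix
W, H = nmf(V, r=5)
err = np.linalg.norm(V - W @ H)
```

The multiplicative form guarantees the factors stay non-negative and the Frobenius objective never increases, which is the property RNMF must preserve while also fitting the sparse corruption term.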
Background: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates when the number of features increases. To address this limitation, feature-elimination Random Forests was proposed, which only uses the features with the largest variable importance scores. Yet the performance of this method is not satisfying, possibly due to its rigid feature selection and the increased correlations between trees of the forest. Methods: We propose variable importance-weighted Random Forests, which, instead of sampling features with equal probability at each node when building trees, samples features according to their variable importance scores and then selects the best split from the randomly selected features. Results: We evaluate the performance of our method through comprehensive simulation and real-data analyses, for both regression and classification. Compared to standard Random Forests and the feature-elimination Random Forests method, our proposed method has improved performance in most cases. Conclusions: By incorporating variable importance scores into the random feature-selection step, our method can better utilize more informative features without completely ignoring less informative ones, and hence has improved prediction accuracy in the presence of weak signals and large noise. We have implemented an R package "viRandomForests" based on the original R package "randomForest"; it can be freely downloaded from http://zhaocenter.org/software.
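The core change — importance-weighted candidate-feature sampling at each node — is small enough to sketch directly (the clipping of negative importances to zero is an assumption of this sketch; the published method is in R):

```python
import numpy as np

def weighted_feature_sample(importances, mtry, rng):
    """Sample `mtry` distinct candidate features for a node split, with
    probability proportional to variable importance scores (negative
    importances are clipped to zero and thus never selected)."""
    imp = np.clip(np.asarray(importances, dtype=float), 0.0, None)
    p = imp / imp.sum()
    return rng.choice(len(imp), size=mtry, replace=False, p=p)

rng = np.random.default_rng(0)
imp = [5.0, 3.0, 0.0, 2.0, 0.0]       # toy importance scores
picked = weighted_feature_sample(imp, mtry=2, rng=rng)
```

Informative features are offered to the split search more often, yet weakly informative ones retain a nonzero chance, which is the "without completely ignoring" behavior the abstract describes.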
In this article, we present a technical overview of modern miniature unmanned rotorcraft systems. We first give a brief review of the historical development of rotorcraft unmanned aerial vehicles (UAVs), and then move on to present a fairly detailed and general overview of the hardware configuration, software integration, aerodynamic modeling and automatic flight control systems involved in constructing such unmanned systems. Applications of the emerging technology in the military and civilian domains are also highlighted.
Background: In recent years, since the molecular docking technique can greatly improve efficiency and reduce research costs, it has become a key tool in computer-assisted drug design for predicting binding affinity and analyzing interaction modes. Results: This study introduces the key principles and procedures of molecular docking and its widely used applications. It also compares the commonly used docking applications and recommends the research areas for which each is suitable. Lastly, it briefly reviews the latest progress in molecular docking, such as integrated methods and deep learning. Conclusion: Limited by incomplete molecular structures and the shortcomings of scoring functions, current docking applications are not accurate enough to predict binding affinity. However, the current molecular docking technique could be improved by integrating big biological data into the scoring functions.
The specificity of protein-DNA interactions is most commonly modeled using position weight matrices (PWMs). First introduced in 1982, PWMs have been adapted to many new types of data, and many different approaches have been developed to determine their parameters. New high-throughput technologies rapidly provide large amounts of data and offer an unprecedented opportunity to determine accurately the specificities of many transcription factors (TFs). But taking full advantage of the new data requires advanced algorithms that take into account the biophysical processes involved in generating the data. The new large datasets can also aid in determining when the PWM model is inadequate and must be extended to provide accurate predictions of binding sites. This article provides a general mathematical description of a PWM and how it is used to score potential binding sites, a brief history of the approaches that have been developed, and the types of data that are used, with an emphasis on algorithms that we have developed for analyzing high-throughput datasets from several new technologies. It also describes extensions that can be added when the simple PWM model is inadequate, and further enhancements that may be necessary. It briefly describes some applications of PWMs in the discovery and modeling of in vivo regulatory networks.
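The PWM scoring described above is standard: each column holds per-base log-odds scores against a background, and a candidate site's score is the sum over positions. A minimal sketch with an invented 3-position motif (the base probabilities and uniform 0.25 background are illustrative assumptions):

```python
import math

# Toy 3-position motif; scores are log-odds log2(p_base / 0.25)
# against a uniform background (all values illustrative).
probs = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
pwm = [{b: math.log2(p[b] / 0.25) for b in "ACGT"} for p in probs]

def score(site):
    """Sum the per-position log-odds scores of a candidate site."""
    return sum(col[b] for col, b in zip(pwm, site))

def best_site(seq):
    """Scan a sequence and return its highest-scoring window."""
    w = len(pwm)
    return max(
        ((seq[i:i + w], score(seq[i:i + w])) for i in range(len(seq) - w + 1)),
        key=lambda t: t[1],
    )

site, s = best_site("TTAGCTT")
```

The extensions the article discusses (dinucleotide dependencies, variable spacing) replace this simple per-column additivity when it proves inadequate.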
A new adaptive mutation particle swarm optimizer, based on the variance of the population's fitness, is presented in this paper. During the run, the mutation probability for the current best particle is determined by two factors: the variance of the population's fitness and the current optimal solution. The ability of the particle swarm optimization (PSO) algorithm to break away from local optima is greatly improved by the mutation. Experimental results show that the new algorithm not only has a great advantage in convergence over the genetic algorithm and standard PSO, but can also avoid the premature convergence problem effectively.
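The mechanism can be sketched as standard PSO plus a variance-triggered mutation of the global best. All constants and the specific variance threshold below are simplified stand-ins for the paper's schedule, shown here on the sphere function:

```python
import numpy as np

def adaptive_pso(f, dim=2, n=20, iters=200, seed=0):
    """PSO with adaptive mutation of the global best.

    When the variance of the population's (normalized) fitness collapses,
    the swarm has likely clustered, so the best particle is mutated with
    a higher probability to help it escape local optima. The variance rule
    and all constants are simplified stand-ins for the paper's schedule.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pval = np.apply_along_axis(f, 1, x)
    g = pbest[pval.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        improved = fx < pval
        pbest[improved], pval[improved] = x[improved], fx[improved]
        g = pbest[pval.argmin()].copy()
        sigma2 = np.var(fx / (np.abs(fx).max() + 1e-12))  # normalized fitness variance
        pm = 0.3 if sigma2 < 1e-3 else 0.01               # adaptive mutation probability
        if rng.random() < pm:
            cand = g + rng.normal(0.0, 0.1, dim)          # mutate the current best
            if f(cand) < f(g):                            # keep only improving mutations
                g = cand
    return g, f(g)

sphere = lambda z: float(np.sum(z ** 2))
g_best, f_best = adaptive_pso(sphere)
```

A small fitness variance signals the clustering that precedes premature convergence, which is when mutating the best particle is most useful.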
Funding: the National Natural Science Foundation of China (Nos. 81520108030, 21472238, 61372194 and 81260672); the Chang Jiang Scholars Program; the Shanghai Engineering Research Center for the Preparation of Bioactive Natural Products (No. 16DZ2280200); the Scientific Foundation of Shanghai, China (Nos. 13401900103 and 13401900101); the National Key Research and Development Program of China (No. 2017YFC1700200); the Natural Science Foundation of Chongqing (No. cstc2018jcyjAX0090); and the Chongqing Education Reform Project of Graduate Studies (No. yjgl52017).
Funding: the National Science Foundation (1252522 to Shashank Singh; 1054309 and 1262575 to Jian Ma) and the National Institutes of Health (HG007352 and DK107965 to Jian Ma).
Funding: This work was supported by the National Defense Pre-Research Foundation of China.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60573068 and 60773113), the Program for New Century Excellent Talents in University (NCET), and the Natural Science Foundation of Chongqing, China (No. 2008BA2017).
Abstract: All eight possible extended rough set models in incomplete information systems are proposed. By analyzing the existing extended models and technical methods of rough set theory, the strategy of model extension is found to be suitable for processing incomplete information systems, instead of filling in possible values for missing attributes. After analyzing the definitions of the existing extended models, a new general extended model is proposed. The new model is a generalization of indiscernibility relations, tolerance relations and non-symmetric similarity relations. Finally, suggestions for the further study of rough set theory in incomplete information systems are put forward.
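Of the relations generalized by the new model, the tolerance relation admits a particularly compact illustration. The sketch below (with "*" as the missing-value marker) follows the standard definition for incomplete information systems and is not taken from the paper:

```python
def tolerant(x, y, missing="*"):
    """Tolerance relation for incomplete information systems: two objects
    are indiscernible if, on every attribute, their values agree or at
    least one of the two values is missing."""
    return all(a == b or a == missing or b == missing for a, b in zip(x, y))

def tolerance_class(obj, universe, missing="*"):
    """All objects of the universe tolerant with `obj` (the granule used
    in place of an equivalence class when data are incomplete)."""
    return [y for y in universe if tolerant(obj, y, missing)]
```

Unlike an indiscernibility relation, this relation is reflexive and symmetric but not transitive, which is exactly why the extended models replace partitions with coverings.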
Abstract: Cancer stem cell (CSC) theory posits a cell-lineage structure in tumor cells in which CSCs are capable of giving rise to the other, non-stem cancer cells (NSCCs) but not vice versa. However, an alternative scenario of bidirectional interconversion between CSCs and NSCCs was proposed very recently. Here we present a general population model of cancer cells by integrating conventional cell divisions with direct conversions between cell states: not only can CSCs differentiate into NSCCs by asymmetric cell division, NSCCs can also dedifferentiate into CSCs by cell-state conversion. Our theoretical model is validated by applying it to recent experimental data. We also find that the transient increase in the CSC proportion, initiated from a purified NSCC subpopulation, cannot be well predicted by the conventional CSC model in which conversion from NSCCs to CSCs is forbidden, implying that cell-state conversion is required especially for the transient dynamics. The theoretical analysis also gives the condition under which our general model reduces to a simple Markov chain with only cell-state transitions while keeping the same cell-proportion dynamics.
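The reduced Markov chain mentioned in the last sentence can be illustrated as follows. The transition probabilities here are hypothetical placeholders, not the paper's fitted values, but they reproduce the qualitative behavior: starting from a purified NSCC population, the CSC proportion rises toward a fixed stationary level:

```python
import numpy as np

# Hypothetical per-step transition probabilities between the two states:
# row = current state, columns = (CSC, NSCC). A CSC mostly stays a CSC;
# an NSCC occasionally dedifferentiates back into a CSC.
P = np.array([[0.90, 0.10],
              [0.05, 0.95]])

def propagate(p0, P, steps):
    """Iterate the reduced chain p_{t+1} = p_t P on state proportions and
    return the whole trajectory."""
    traj = [np.asarray(p0, dtype=float)]
    for _ in range(steps):
        traj.append(traj[-1] @ P)
    return np.array(traj)
```

With these placeholder rates the stationary CSC proportion is 1/3 (solving pi P = pi), so a culture purified to 100% NSCCs drifts back toward a mixed equilibrium, which is the signature the bidirectional model explains and the conversion-free model cannot.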
Funding: This work was supported by the National Natural Science Foundation of China (NSFC 61005041), the Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP 20100032120039), the Tianjin Natural Science Foundation (No. 12JCQNJC02300), and the China Postdoctoral Science Foundation (Nos. 2012T50240 and 2013M530114).
Abstract: Many existing bioinformatics predictors are based on machine learning technology. When applying these predictors in practical studies, their predictive performance should be well understood. Different studies apply different performance measures and evaluation methods, and even for the same performance measure, different terms, nomenclatures or notations may appear in different contexts. Results: We carried out a review of the most commonly used performance measures and evaluation methods for bioinformatics predictors. Conclusions: It is important in bioinformatics to correctly understand and interpret performance, as this is the key to rigorously comparing different predictors and choosing the right one.
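Several of the most common binary-classification measures covered by such reviews can be computed directly from confusion-matrix counts. This helper is a generic illustration, not code from the review:

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard measures from the four confusion-matrix counts:
    sensitivity (recall), specificity, accuracy, and the Matthews
    correlation coefficient (MCC), which stays informative on
    imbalanced data."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    acc = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sens, "specificity": spec,
            "accuracy": acc, "MCC": mcc}
```

Note how accuracy alone can look flattering while MCC exposes weak agreement, which is one reason reviews insist on reporting several measures together.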
Funding: This work was supported by the National Basic Research Program of China (973 Program, Grant No. 2012CB721104), the National High Technology Research and Development Program (863 Program, Grant No. 2012AA02A701), the National Natural Science Foundation of China (Grant Nos. 31170101 and 31301017), and the Natural Science Foundation of Guangdong Province, China (Grant No. 2015A030310317).
Abstract: Cis-acting regulatory elements, e.g., promoters and ribosome binding sites (RBSs) with various desired properties, are building blocks widely used in synthetic biology for fine-tuning gene expression. In the last decade, the acquisition of controllable regulatory elements from random libraries has been established and applied to control protein expression and metabolic flux in different chassis cells. However, more rational strategies are still urgently needed to improve efficiency and to reduce laborious screening and multifaceted characterization. Building precise computational models that can predict the activity of regulatory elements and quantitatively design elements of a desired strength has demonstrated tremendous potential. Here, recent progress on the construction of cis-acting regulatory element libraries and on quantitative predictive models for the design of such elements is reviewed and discussed in detail.
Abstract: Non-smooth or even abrupt state changes occur during many biological processes, e.g., cell differentiation, proliferation, or disease deterioration. Such dynamics generally signals the emergence of critical transition phenomena, which result in drastic changes of system states or eventually qualitative changes of phenotypes. Hence, it is of great importance to detect such transitions and further reveal their molecular mechanisms at the network level. Here, we review recent advances on dynamical network biomarkers (DNBs) and the related theoretical foundation, which can identify not only early signals of critical transitions but also their leading networks, i.e., the subnetworks that drive the whole system to initiate such transitions. To demonstrate the effectiveness of this novel approach, examples of complex diseases are also provided in which the pre-disease stage is detected where traditional methods and biomarkers failed.
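The DNB criterion is usually summarized by a composite index that rises sharply near a critical transition: within the leading module, fluctuation (standard deviation) and intra-module correlation increase, while correlation with the rest of the network decreases. The sketch below is a simplified illustration of such an index for one sample window, not the authors' implementation:

```python
import numpy as np

def dnb_index(X, module, rest):
    """Composite DNB-style index I = SD_in * PCC_in / PCC_out.

    X: samples-by-genes matrix for one time window; `module` and `rest`
    are column index lists for the candidate module and the remaining
    genes. A sharp rise of I across successive windows is read as an
    early-warning signal of an impending transition."""
    corr = np.corrcoef(X, rowvar=False)
    sd_in = X[:, module].std(axis=0).mean()
    pairs = np.triu_indices(len(module), 1)
    pcc_in = np.abs(corr[np.ix_(module, module)][pairs]).mean()
    pcc_out = np.abs(corr[np.ix_(module, rest)]).mean()
    return sd_in * pcc_in / max(pcc_out, 1e-12)
```

In practice the index would be computed per candidate module per time point, and the module whose index diverges fastest is reported as the leading network.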
Abstract: Background: Marker detection is an important task in complex disease studies. Here we provide an association rule mining (ARM) based approach for identifying integrated markers through mutual information (MI) based statistically significant feature extraction, and apply it to acute myeloid leukemia (AML) and prostate carcinoma (PC) gene expression and methylation profiles. Methods: We first collect the genes having both expression and methylation values in AML as well as PC. Next, we run the Jarque-Bera normality test on the expression/methylation data to divide the whole dataset into two parts: one that follows a normal distribution and one that does not. We thus obtain four parts of the dataset: normally distributed expression data, normally distributed methylation data, non-normally distributed expression data, and non-normally distributed methylation data. A feature-extraction technique, mRMR, is then utilized on each part, resulting in a list of top-ranked genes. Next, we apply the Welch t-test (parametric) and the Shrink t-test (non-parametric) on the expression/methylation data for the top selected normally distributed and non-normally distributed genes, respectively. We then use a recent weighted ARM method, RANWAR, to combine all/specific resultant genes to generate top oncogenic rules along with the respective integrated markers. Finally, we perform a literature search as well as KEGG pathway and Gene Ontology (GO) analyses using the Enrichr database for in silico validation of the prioritized oncogenes as markers, labeling each marker as existing or novel. Results: The novel markers of AML are {ABCB11↑ U KRT17↓} (i.e., ABCB11 as up-regulated, and KRT17 as down-regulated) and {AP1S1↓ U KRT17↓ U NEIL2↓ U DYDC1↓} (i.e., AP1S1 and NEIL2 both as hypo-methylated, and KRT17 and DYDC1 both as down-regulated). The novel marker of PC is {UBIAD1 U APBA2 U C4orf31} (i.e., UBIAD1 as up-regulated and hypo-
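The Jarque-Bera test used in the first splitting step has a simple closed form, JB = n/6 · (S² + (K − 3)²/4), with sample skewness S and kurtosis K; under normality JB is approximately chi-square with 2 degrees of freedom, so large values reject normality. A minimal sketch (not the authors' pipeline code):

```python
import numpy as np

def jarque_bera(x):
    """Jarque-Bera normality statistic from sample skewness and kurtosis."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    s2 = (d ** 2).mean()                    # biased sample variance
    S = (d ** 3).mean() / s2 ** 1.5         # skewness
    K = (d ** 4).mean() / s2 ** 2           # kurtosis (normal => 3)
    return n / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)
```

Each gene's expression (or methylation) vector would be routed to the "normal" branch when JB falls below the chosen chi-square critical value (about 5.99 at the 5% level) and to the "non-normal" branch otherwise.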
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 50308008, 10672048 and 50278029) and the China Postdoctoral Science Foundation.
Abstract: As a strain sensing element for structural health monitoring, the fibre-optic Bragg grating (FBG) has been widely studied and applied. The accuracy of an FBG sensor depends strongly on the physical and mechanical properties governing interfacial strain transfer among the layers of bare optical fibre, protective coating, adhesive layer and host material. In this paper, the general expression of the multilayer interfacial strain-transfer mechanism is first derived. Second, based on the defined average strain, the error-modified equation of the FBG sensor is obtained. Finally, for the embedded tube-packaged FBG and the fibre reinforced polymer-optical fibre Bragg grating (FRP-OFBG) strain sensors developed at the Harbin Institute of Technology (HIT), the corresponding strain-transfer laws are studied and the corresponding error-modification coefficients are given, which are validated by experiments. The results provide a theoretical basis for the development and application of embedded FBG sensors.
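For orientation, the classical single-layer shear-lag result that multilayer expressions of this kind generalize is often quoted in the following form; this is a standard illustration from the strain-transfer literature, not the multilayer formula derived in the paper:

```latex
\frac{\varepsilon_f(x)}{\varepsilon_m} = 1 - \frac{\cosh(kx)}{\cosh(kL)},
\qquad
\bar{\alpha} \;=\; \frac{1}{2L}\int_{-L}^{L}\frac{\varepsilon_f(x)}{\varepsilon_m}\,dx
\;=\; 1 - \frac{\sinh(kL)}{kL\,\cosh(kL)},
```

where ε_f is the strain sensed by the fibre, ε_m the host-material strain, 2L the bonded length, and k a shear-lag parameter set by the geometry and shear moduli of the intermediate layers. Because the average transfer rate satisfies ᾱ < 1, the measured strain must be scaled by an error-modification coefficient (of order 1/ᾱ) to recover the true host strain, which is the role of the coefficients derived in the paper.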
Funding: The work in XGZ's lab is supported by the CAS strategic leading project on designer breeding by molecular module (No. XDA08020301), the National High Technology Development Plan of the Ministry of Science and Technology of China (2014AA101601), the National Natural Science Foundation of China (No. C020401), the National Key Basic Research Program of China (No. 2015CB150104), the Bill and Melinda Gates Foundation (No. OPP1060461), and the CAS-CSIRO Cooperative Research Program (No. GJHZ1501).
Abstract: Background: The increase in global population, climate change and the stagnation of crop yields per unit land area in recent decades urgently call for a new approach to support contemporary crop improvement. ePlant is a mathematical model of plant growth and development with a high level of mechanistic detail to meet this challenge. Results: ePlant integrates modules developed for processes occurring at drastically different temporal (10⁻⁸ to 10⁶ seconds) and spatial (10⁻¹⁰ to 10 meters) scales, incorporating diverse physical, biophysical and biochemical processes including gene regulation, metabolic reaction, substrate transport and diffusion, energy absorption, transfer and conversion, organ morphogenesis, and plant-environment interaction. Individual modules are developed using a divide-and-conquer approach; modules at different temporal and spatial scales are integrated through transfer variables. We further propose a supervised learning procedure based on information geometry to combine model and data for both knowledge discovery and model extension or advancement. We finally discuss the recent formation of a global consortium, which includes experts in plant biology, computer science, statistics, agronomy, phenomics, etc., aiming to expedite the development and application of ePlant or its equivalents by promoting a new model-development paradigm in which models are built as a community effort instead of being driven mainly by individual labs. Conclusions: ePlant, as a major research tool supporting quantitative and predictive plant science, will play a crucial role in future model-guided crop engineering, breeding and agronomy.
Abstract: Experimental evidence and theoretical analyses have amply suggested that in cancer genesis and progression genetic information is very important, but it is not the whole story. Nevertheless, "cancer as a disease of the genome" is still the dominant doctrine. Against this background, and based on the fundamental properties of biological systems, a new endogenous molecular-cellular network theory for cancer was recently proposed by us; similar proposals have also been made by others. The new theory attempts to incorporate both genetic and environmental effects into one single framework, with the possibility of giving a quantitative and dynamical description. It asserts that the complex regulatory machinery behind biological processes may be modeled by a nonlinear stochastic dynamical system similar to a noise-perturbed Morse-Smale system, from which both qualitative and quantitative descriptions may be obtained. The dynamical variables are specified by a set of endogenous molecular-cellular agents, and the structure of the dynamical system by the interactions among those agents. Here we review this theory from a pedagogical angle that emphasizes the role of modularization, hierarchy and autonomous regulation. We discuss how the core set of assumptions is exemplified in detail in one of the simplest, most important and best studied model organisms, phage lambda. With this concrete and quantitative example in hand, we show that applying the hypothesized theory to human cancer, such as hepatocellular carcinoma (HCC), is plausible, and that it may provide a set of new insights into understanding cancer genesis and progression, and into strategies for cancer prevention, cure and care.
Funding: This work was supported by the Scholarship Award for Excellent Doctoral Students granted by the Ministry of Education, and the National Natural Science Foundation of China (Grant No. 60875044).
Abstract: Non-negative matrix factorization (NMF) is a recently popularized technique for learning parts-based, linear representations of non-negative data. Traditional NMF is optimized under a Gaussian or Poisson noise assumption, and is hence not suitable if the data are grossly corrupted. To improve the robustness of NMF, a novel algorithm named robust non-negative matrix factorization (RNMF) is proposed in this paper. We assume that some entries of the data matrix may be arbitrarily corrupted, but that the corruption is sparse. RNMF decomposes the non-negative data matrix as the sum of one sparse error matrix and the product of two non-negative matrices. An efficient iterative approach is developed to solve the RNMF optimization problem. We present experimental results on two face databases to verify the effectiveness of the proposed method.
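The decomposition X ≈ W H + S (with S sparse) can be sketched with a simple alternating scheme: soft-threshold the residual to update the sparse error, then apply standard multiplicative NMF updates to the cleaned matrix. This is a simplified illustration of the decomposition model, not the paper's actual optimization algorithm:

```python
import numpy as np

def rnmf(X, r, lam=0.5, iters=200, seed=0):
    """Simplified robust-NMF sketch: X ≈ W H + S with entrywise-sparse S.

    Alternates multiplicative NMF updates on the de-corrupted matrix
    with a soft-threshold update of the sparse error; by construction
    the final residual X - W H - S is bounded by lam entrywise."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r)) + 0.1
    H = rng.random((r, n)) + 0.1
    S = np.zeros_like(X)
    eps = 1e-9
    for _ in range(iters):
        Y = np.maximum(X - S, 0.0)                 # clean, non-negative target
        H *= (W.T @ Y) / (W.T @ W @ H + eps)       # multiplicative updates
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
        R = X - W @ H
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)  # sparse error
    return W, H, S
```

The soft-threshold keeps S at zero wherever the factorization already fits the data, so only grossly corrupted entries (e.g., occlusions in face images) are absorbed into the error matrix.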
Abstract: Background: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates when the number of features increases. To address this limitation, feature-elimination Random Forests was proposed, which uses only the features with the largest variable importance scores. Yet the performance of this method is not satisfying, possibly due to its rigid feature selection and the increased correlation between the trees of the forest. Methods: We propose variable importance-weighted Random Forests, which, instead of sampling features with equal probability at each node when building trees, samples features according to their variable importance scores and then selects the best split from the randomly selected features. Results: We evaluate the performance of our method through comprehensive simulation and real data analyses, for both regression and classification. Compared to standard Random Forests and feature-elimination Random Forests, our proposed method has improved performance in most cases. Conclusions: By incorporating variable importance scores into the random feature selection step, our method can better utilize the more informative features without completely ignoring the less informative ones, and hence has improved prediction accuracy in the presence of weak signals and large noise. We have implemented an R package "viRandomForests" based on the original R package "randomForest"; it can be freely downloaded from http://zhaocenter.org/software.
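The core modification, replacing uniform per-node feature sampling with importance-proportional sampling, can be sketched in isolation. This toy helper (names and the mtry parameter are illustrative; the actual package is written in R on top of randomForest) shows the sampling step only:

```python
import numpy as np

def sample_split_features(importance, mtry, rng):
    """Draw `mtry` candidate split features without replacement, with
    probability proportional to variable importance scores, instead of
    uniformly as in standard Random Forests."""
    p = np.asarray(importance, dtype=float)
    p = p / p.sum()
    return rng.choice(len(p), size=mtry, replace=False, p=p)
```

Informative features are therefore offered to the splitter far more often than noise features, yet every feature keeps a nonzero chance of being tried, which is the stated contrast with hard feature elimination.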
Abstract: In this article, we attempt to document a technical overview of modern miniature unmanned rotorcraft systems. We first give a brief review of the historical development of rotorcraft unmanned aerial vehicles (UAVs), and then present a fairly detailed and general overview of the hardware configuration, software integration, aerodynamic modeling and automatic flight control systems involved in constructing such an unmanned system. Applications of this emerging technology in the military and civilian domains are also highlighted.
Funding: Supported by the National Natural Science Foundation of China (No. 61372138) and the National Science and Technology Major Project of China (No. 2018ZX10201002).
Abstract: Background: In recent years, because molecular docking can greatly improve efficiency and reduce research costs, it has become a key tool in computer-assisted drug design for predicting binding affinity and analyzing interaction modes. Results: This study introduces the key principles and procedures of molecular docking and its widely used applications. It also compares the commonly used docking applications and recommends the research areas for which each is suitable. Lastly, it briefly reviews the latest progress in molecular docking, such as integrated methods and deep learning. Conclusion: Limited by incomplete molecular structures and the shortcomings of current scoring functions, docking applications are not yet accurate enough to predict binding affinity. However, the current molecular docking technique could be improved by integrating big biological data into the scoring functions.
Abstract: The specificity of protein-DNA interactions is most commonly modeled using position weight matrices (PWMs). First introduced in 1982, PWMs have been adapted to many new types of data, and many different approaches have been developed to determine their parameters. New high-throughput technologies provide large amounts of data rapidly and offer an unprecedented opportunity to determine accurately the specificities of many transcription factors (TFs). But taking full advantage of the new data requires advanced algorithms that take into account the biophysical processes involved in generating the data. The new large datasets can also aid in determining when the PWM model is inadequate and must be extended to provide accurate predictions of binding sites. This article provides a general mathematical description of a PWM and how it is used to score potential binding sites, a brief history of the approaches that have been developed, and the types of data that are used, with an emphasis on algorithms that we have developed for analyzing high-throughput datasets from several new technologies. It also describes extensions that can be added when the simple PWM model is inadequate, and further enhancements that may be necessary. It briefly describes some applications of PWMs in the discovery and modeling of in vivo regulatory networks.
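The basic PWM construction and scoring described here can be sketched in a few lines: counts from aligned sites (plus a pseudocount) become position-specific frequencies, these are converted to log-odds against a background distribution, and a candidate site's score is the sum of the matrix entries it selects. A minimal illustration assuming a uniform 0.25 background (the pseudocount value is an arbitrary choice):

```python
import numpy as np

BASES = {b: i for i, b in enumerate("ACGT")}

def pwm_from_sites(sites, pseudocount=0.5, background=0.25):
    """Build a log2-odds PWM (4 x L) from equal-length aligned sites."""
    L = len(sites[0])
    counts = np.full((4, L), pseudocount)
    for s in sites:
        for j, b in enumerate(s):
            counts[BASES[b], j] += 1
    freqs = counts / counts.sum(axis=0)       # per-position base frequencies
    return np.log2(freqs / background)

def score(pwm, seq):
    """Additive PWM score of a candidate site of the same length."""
    return sum(pwm[BASES[b], j] for j, b in enumerate(seq))
```

Scanning a genome then amounts to sliding `score` along every length-L window and keeping windows above a threshold, exactly the step that the extensions discussed in the article refine when independence between positions breaks down.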
Funding: Supported by the Gansu Natural Science Foundation (No. ZS011-A25-016-G).
Abstract: A new adaptive mutation particle swarm optimizer, based on the variance of the population's fitness, is presented in this paper. During the run, the mutation probability for the current best particle is determined by two factors: the variance of the population's fitness and the current optimal solution. The mutation greatly improves the ability of particle swarm optimization (PSO) to break away from local optima. Experimental results show that the new algorithm not only has a clear advantage in convergence over the genetic algorithm and standard PSO, but can also effectively avoid premature convergence.
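The mechanism can be sketched as standard PSO plus one extra step: when the population's fitness variance collapses (a symptom of premature convergence), the current best particle is perturbed. All coefficients, the variance threshold and the mutation scale below are illustrative placeholders, not the paper's settings:

```python
import numpy as np

def adaptive_pso(f, dim=2, n=20, iters=200, seed=0):
    """PSO sketch with variance-triggered mutation of the global best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pval = np.array([f(p) for p in x])
    g, gval = pbest[pval.argmin()].copy(), pval.min()
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        # standard velocity/position update (inertia + cognitive + social)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = x + v
        fx = np.array([f(p) for p in x])
        better = fx < pval
        pbest[better], pval[better] = x[better], fx[better]
        if pval.min() < gval:
            g, gval = pbest[pval.argmin()].copy(), pval.min()
        # adaptive mutation: kick the best particle out of a stagnant basin
        if pval.var() < 1e-6:
            trial = g + rng.normal(scale=0.1, size=dim)
            if f(trial) < gval:
                g, gval = trial, f(trial)
    return g, gval
```

In the paper the mutation is probabilistic, with the probability shaped by both the fitness variance and the current optimum; the hard threshold used here is only the simplest stand-in for that rule.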