Ribonucleic acids (RNAs) play a vital role in biology, and knowledge of their three-dimensional (3D) structure is required to understand their biological functions. Structure prediction methods have recently been developed to address this issue, but most existing methods predict a series of candidate RNA 3D structures rather than a single one. Evaluating the predicted structures is therefore indispensable. Although several methods have been proposed to assess RNA 3D structures, the existing methods are not precise enough. In this work, a new all-atom knowledge-based potential is developed for more accurately evaluating RNA 3D structures. The potential not only includes local and nonlocal interactions but also fully considers the specificity of each RNA by introducing a retraining mechanism. On extensive test sets generated by independent methods, the proposed potential correctly distinguished the native state and ranked near-native conformations well enough to select the best one. Furthermore, the proposed potential precisely captured RNA structural features such as base stacking and base pairing. Comparisons with existing potentials show that the proposed potential is reliable and accurate in RNA 3D structure evaluation.
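For readers unfamiliar with knowledge-based potentials, the general idea can be sketched with Boltzmann inversion of distance histograms. This is a minimal illustration, not the authors' potential: the abstract does not specify the reference state, the binning, or the retraining mechanism, so the function names `boltzmann_inversion` and `score_structure`, the `kT` and `pseudocount` parameters, and the toy histograms are all assumptions made for the example.

```python
import math

def boltzmann_inversion(obs_counts, ref_counts, kT=1.0, pseudocount=1e-6):
    """Turn observed and reference distance histograms into per-bin energies.

    E(bin) = -kT * ln(P_obs(bin) / P_ref(bin)); the pseudocount avoids log(0).
    """
    obs_total = sum(obs_counts)
    ref_total = sum(ref_counts)
    energies = []
    for n_obs, n_ref in zip(obs_counts, ref_counts):
        p_obs = n_obs / obs_total + pseudocount
        p_ref = n_ref / ref_total + pseudocount
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies

def score_structure(distances, energies, bin_width):
    """Sum the bin energies of all atom-pair distances in one candidate structure.

    Distances beyond the last bin are ignored (treated as non-interacting).
    """
    total = 0.0
    for d in distances:
        idx = int(d // bin_width)
        if idx < len(energies):
            total += energies[idx]
    return total

# Toy histograms (hypothetical): three distance bins of width 1.0.
energies = boltzmann_inversion([10, 30, 60], [30, 40, 30])
candidate_score = score_structure([0.5, 2.5], energies, bin_width=1.0)
```

Under this scheme, distance bins that are over-represented in known structures relative to the reference receive negative energies, so a lower total score marks a more native-like candidate.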
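Single-cell RNA sequencing (scRNA-seq) makes high-throughput, high-resolution study of individual cells possible and is an important technical means of studying cell function and the underlying gene regulatory mechanisms at the single-cell level. The technology also brings new challenges: single-cell data are large in scale, noisy, and highly heterogeneous, and in particular the high proportion of missing data (dropout) seriously undermines the reliability of downstream analysis and can even mask important gene-gene relationships. This work proposes ND-Impute (Negative binomial distribution based Divide and conquer strategy for imputation), a divide-and-conquer imputation strategy based on the negative binomial distribution, for processing scRNA-seq data. The method assumes that scRNA-seq data follow a negative binomial distribution, uses an autoencoder with a specific loss function to obtain data-specific parameters, and applies a divide-and-conquer strategy to estimate the latent gene expression values. Comparisons of clustering performance, correlation, and error show that the method effectively recovers missing data and improves the accuracy of subsequent analyses.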
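The negative binomial assumption can be made concrete with a small sketch of a negative binomial log-likelihood, which is one common choice of loss for count autoencoders. The abstract does not state which loss ND-Impute uses, so treating it as a mean/dispersion negative log-likelihood is an assumption, and the names `nb_log_likelihood`, `nb_loss`, `mu`, and `theta` are hypothetical.

```python
import math

def nb_log_likelihood(x, mu, theta):
    """Log-probability of count x under a negative binomial distribution
    parameterized by mean mu (> 0) and dispersion theta (> 0)."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))

def nb_loss(counts, means, theta):
    """Mean negative log-likelihood of observed counts given predicted means.

    In an autoencoder setting, `means` would be the decoder output
    (an assumption here; this sketch uses plain Python lists)."""
    total = sum(nb_log_likelihood(x, m, theta) for x, m in zip(counts, means))
    return -total / len(counts)

# Predicted means close to the observed counts give a lower loss.
good_fit = nb_loss([5, 0, 2], [5.0, 0.1, 2.0], theta=2.0)
poor_fit = nb_loss([5, 0, 2], [0.5, 5.0, 9.0], theta=2.0)
```

Minimizing such a loss pushes the predicted means toward values under which the observed counts are likely, while the dispersion parameter accommodates the over-dispersion typical of scRNA-seq counts.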
Funding: Project supported by the National Science Foundation of China (Grant Nos. 11605125, 11105054, 11274124, and 11401448).