Abstract
Biological disease data are often perturbed by strong noise, which makes data mining inaccurate and causes some traditional methods for detecting critical-point signals to fail. In this paper, we apply probability distribution embedding based on individual-specific networks to two time-series datasets, one for liver cancer and one for prostate cancer, to detect the critical signal that precedes an abrupt malignant transition of the disease and thereby predict its transition point. The theoretical basis of this work is that the probability distribution embedding transforms the large-noise data on the sample states of the original system into small-noise data on the sample probability distributions, after which individual-specific networks are constructed. We find that this approach substantially reduces the effect of noise on the data and addresses the problem of having few samples. Disease transition signals are then detected using dynamic network biomarkers, and a functional analysis of these biomarkers shows that they reflect the critical signal well.
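As a rough illustration of the dynamic network biomarker idea mentioned in the abstract, the sketch below computes a composite index of the form commonly used in the dynamic network biomarker literature: the average standard deviation inside a candidate gene module, multiplied by the average absolute correlation inside the module, divided by the average absolute correlation between the module and the remaining genes. This is an assumption made for illustration only; the abstract does not state the exact index used, and the sketch covers neither the probability distribution embedding nor the individual-specific network construction. The function name `dnb_composite_index` and the synthetic data are hypothetical.

```python
import numpy as np


def dnb_composite_index(expr, module_idx):
    """Composite index for one time point (illustrative, not the paper's code).

    expr       : array of shape (genes, samples)
    module_idx : indices of the candidate biomarker genes (at least two,
                 and at least one gene must remain outside the module)
    """
    expr = np.asarray(expr, dtype=float)
    inside = np.zeros(expr.shape[0], dtype=bool)
    inside[module_idx] = True

    # Average sample standard deviation of the module genes.
    sd_in = expr[inside].std(axis=1, ddof=1).mean()

    # Absolute Pearson correlations between all pairs of genes.
    corr = np.abs(np.corrcoef(expr))

    # Mean correlation inside the module, excluding the diagonal of ones.
    in_block = corr[np.ix_(inside, inside)]
    k = in_block.shape[0]
    pcc_in = (in_block.sum() - k) / (k * (k - 1))

    # Mean correlation between module genes and all other genes.
    pcc_out = corr[np.ix_(inside, ~inside)].mean()

    return sd_in * pcc_in / pcc_out


# Synthetic data (assumed): 20 genes, 30 samples, module = genes 0..4.
rng = np.random.default_rng(0)
module = np.arange(5)

# "Stable" time point: all genes fluctuate independently.
stable = rng.normal(0.0, 1.0, size=(20, 30))

# "Critical" time point: module genes share one strong common fluctuation.
critical = rng.normal(0.0, 1.0, size=(20, 30))
common = rng.normal(0.0, 1.0, size=30)
critical[module] = 3.0 * common + 0.1 * rng.normal(0.0, 1.0, size=(5, 30))

ci_stable = dnb_composite_index(stable, module)
ci_critical = dnb_composite_index(critical, module)
```

In this toy setting the index at the "critical" time point is much larger than at the "stable" one, which is the kind of sharp rise that is read as an early-warning signal of an impending transition.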
Source
《计算生物学》
2018, Issue 4, pp. 70-79 (10 pages)
Hans Journal of Computational Biology