Application of Graph Entropy in CRISPR and Repeats Detection in DNA Sequences
1
Authors: Dipendra C. Sengupta, Jharna D. Sengupta. Computational Molecular Bioscience, 2016, No. 3, pp. 41-51 (11 pages)
We analyzed DNA sequences using a new measure of entropy. The general aim was to find interesting sections of a genome using a new formulation of Shannon-like entropy. We developed this measure for any non-trivial graph or, more broadly, for any square matrix whose non-zero elements represent probabilistic weights assigned to connections or transitions between pairs of vertices. The new measure is called the graph entropy, and it quantifies the aggregate indeterminacy effected by the variety of unique walks that exist between each pair of vertices. The new tool is shown to be uniquely capable of revealing CRISPR regions in bacterial genomes and of identifying tandem repeats and direct repeats in a genome. We ran experiments on 26 species and found many tandem repeats and direct repeats (CRISPR for bacteria or archaea). Several separate CRISPR finders and tandem-repeat finders already exist, but our entropy measure can find both features when they are present in a genome.
Keywords: CRISPR; Graph Entropy; Tandem Repeats; DNA Sequences
Similarity Studies of Corona Viruses through Chaos Game Representation (Cited by: 1)
2
Authors: Dipendra C. Sengupta, Matthew D. Hill, +1 more author, Kevin R. Benton, Hirendra N. Banerjee. Computational Molecular Bioscience, 2020, No. 3, pp. 61-72 (12 pages)
The novel coronavirus (SARS-CoV-2), generally referred to as the Covid-19 virus, has spread to 213 countries, with nearly 7 million confirmed cases and nearly 400,000 deaths. Such major outbreaks demand classification of the virus's genomic sequence and identification of its origin, for planning, containment, and treatment. Motivated by this need, we report two alignment-free methods that combine with chaos game representation (CGR) to perform clustering analysis and build a phylogenetic tree from it. To each DNA sequence we associate a matrix, and we define the distance between two DNA sequences to be the distance between their associated matrices. These methods are used for phylogenetic analysis of coronavirus sequences. Our approach provides a powerful tool for analyzing and annotating genomes and their phylogenetic relationships. We also compare our tool with the ClustalX algorithm, one of the most popular alignment methods. Our alignment-free methods are shown to be capable of finding the closest genetic relatives of coronaviruses.
Keywords: Covid-19; Chaos Game Representation; Deoxyribonucleic Acid; Phylogenetic Analysis; Shannon Entropy
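The idea of mapping each DNA sequence to a matrix and comparing matrices can be illustrated with a minimal sketch. It assumes a k-mer frequency matrix laid out by CGR position and a Euclidean (Frobenius) distance. The abstract does not say which matrices or distances the authors use, so the corner assignment, the parameter `k`, and the function names `cgr_matrix` and `matrix_distance` are all hypothetical.

```python
import math

# Assumed corner assignment for the unit square; the abstract does not specify one.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_matrix(seq, k=3):
    """Return a 2^k x 2^k matrix of k-mer frequencies placed at their CGR cells."""
    size = 2 ** k
    counts = [[0] * size for _ in range(size)]
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # skip k-mers containing ambiguous bases such as N
        x = y = 0
        # In CGR the most recent base halves the distance to its corner,
        # so the last base of the k-mer selects the coarsest quadrant.
        for base in reversed(kmer):
            cx, cy = CORNERS[base]
            x = 2 * x + cx
            y = 2 * y + cy
        counts[y][x] += 1
        total += 1
    if total == 0:
        return [[0.0] * size for _ in range(size)]
    return [[c / total for c in row] for row in counts]


def matrix_distance(a, b):
    """Euclidean (Frobenius) distance between two equally sized matrices."""
    return math.sqrt(
        sum((p - q) ** 2 for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))
    )
```

A pairwise table of `matrix_distance` values over a set of genomes could then be passed to any standard clustering or tree-building routine. That step is omitted here because the abstract does not say which one was used.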
Evolutionary Relationship of Protein Sequences of SARS-CoV-2 and Other Viruses through Chaos Game Representation
3
Authors: Matthew D. Hill, Kevin E. Simmons, Dipendra C. Sengupta. Computational Molecular Bioscience (CAS), 2022, No. 3, pp. 123-143 (21 pages)
Comparison between different biological sequences is a key step in bioinformatics when analyzing sequence similarities and phylogenetic relationships. Chaos Game Representation (CGR), a method of graphically representing biological sequences, has found many applications in bioinformatics. The key issue in applying CGR is to extract as many useful features as possible from it. CGR was initially applied to DNA sequences, but in this paper a CGR-based approach is used to extract suitable features for comparing protein sequences of SARS-CoV-2 and other viruses. To this end, several viral protein sequences from 12 groups are considered, and CGR centroid, amino acid frequency, compounded frequency, Shannon entropy, and Kullback-Leibler Discrimination Information are applied to find the inter-relationship among the sequences. The experimental results demonstrate the potential strengths of the CGR-based method for examining the evolutionary relationship of protein sequences. Our method is powerful for extracting effective features from protein sequences, and is therefore important in classifying proteins and inferring the phylogeny of viruses.
Keywords: Chaos Game Representation (CGR); Protein; Multi-Dimensional Scaling (MDS)
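Two of the features named in the abstract, Shannon entropy and Kullback-Leibler discrimination information, can be illustrated on plain amino acid frequencies. This is only a sketch under stated assumptions: the abstract does not say how the authors compute these quantities from CGR, and the 20-letter alphabet, the `pseudocount` smoothing, and all function names here are hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids in one-letter code (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(seq, pseudocount=0.0):
    """Relative frequency of each standard amino acid in a protein sequence.

    `pseudocount` is an illustrative smoothing term added to every count so
    that no frequency is zero; it is not taken from the abstract.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def shannon_entropy(freqs):
    """Shannon entropy in bits; zero-probability symbols contribute nothing."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Requires q[a] > 0 wherever p[a] > 0; using a positive pseudocount when
    building q guarantees this.
    """
    return sum(p[a] * math.log2(p[a] / q[a]) for a in p if p[a] > 0)
```

Because the Kullback-Leibler divergence is not symmetric, a symmetrized form such as `kl_divergence(p, q) + kl_divergence(q, p)` is one common choice when a distance-like quantity is needed. Whether the paper symmetrizes is not stated in the abstract.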