Abstract
RNA secondary structure has become the most exploitable feature for ab initio detection of non-coding RNA (ncRNA) genes in genome sequences. Previous work has used minimum free energy (MFE) based methods to identify ncRNAs by measuring the stability and certainty of sequence folds. However, the performance of these methods varies across ncRNA species. Designing novel, reliable structural measures will help in developing effective ncRNA gene-finding tools. This paper introduces a new RNA structural measure based on a novel RNA secondary structure ensemble constrained by characteristics of native RNA tertiary structures. The new method makes possible a leap in performance over previous structure-based methods. Test results on standard ncRNA benchmark datasets demonstrate that this method can effectively separate most ncRNA families from genomic background.
Funding
Supported in part by:
NSF MRI grant 0821263
NIH BISTI grant R01GM072080-01A1
NIH ARRA Administrative Supplement to NIH BISTI grant R01GM072080-01A1
NSF IIS grant 0916250