2 articles found
1. Verbumculus and the Discovery of Unusual Words (cited by: 1)
Authors: Alberto Apostolico, Fang-Cheng Gong, Stefano Lonardi. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, No. 1, pp. 22-41 (20 pages)
Measures relating word frequencies and expectations have been of constant interest in bioinformatics studies. With sequence data becoming massively available, exhaustive enumeration of such measures has become conceivable, yet it poses a significant computational burden even when limited to words of bounded maximum length. In addition, displaying the huge tables that may result from these counts poses practical problems of visualization and inference. VERBUMCULUS is a suite of software tools for the fast and efficient detection of over- or under-represented words in nucleotide sequences. The inner core of VERBUMCULUS rests on subtly interwoven properties of statistics, pattern matching, and combinatorics on words, which make it possible to limit drastically, and a priori, the set of over- or under-represented candidate words of all lengths in a given sequence, thereby making it more feasible both to detect and to visualize such words in a fast and practically useful way. This paper first describes the facility and then reports experimental results, ranging from simulations on synthetic data to the discovery of regulatory elements in the upstream regions of a set of yeast genes.
Keywords: Verbumculus; unusual words; subword statistics; pattern discovery; regulatory elements; suffix trees
2. Verbumculus and the Discovery of Unusual Words
Authors: Alberto Apostolico, Fang-Cheng Gong, Stefano Lonardi. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, No. C00, p. 8 (1 page)
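Measuring the frequencies and expected values of related words has long been of interest in bioinformatics research. As the amount of available sequence data grows massively, the ability to process biological data such as related words is steadily improving, while the computational burden also increases markedly, even when word boundaries and maximum length are restricted. In addition, the huge tables produced by large-scale computation raise practical problems of visualization and inference.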
Keywords: Verbumculus; unusual words; bioinformatics; genome