Abstract: As we enter 2011, the 2009 H1N1 pandemic influenza virus is in the news again. At least 20 people have died of this virus in China since the beginning of 2011, and it is now the predominant flu strain in the country. Although this novel virus was quite stable during the 2009-2010 flu season, a genetic variant was found in Singapore in early 2010, and then in Australia and New Zealand during their 2010 winter influenza season. Several critical mutations in the HA protein of this variant were uncovered in strains collected from January 2010 to April 2010. Moreover, a structural homology model of HA from the A/Brisbane/10/2010(H1N1) strain was built based on the structure of A/California/04/2009(H1N1). The purpose of this study was to investigate mutations in the HA protein of 2009 H1N1 using sequence data collected worldwide from May 2010 to February 2011. A fundamental problem in bioinformatics and biology is to find sequences similar to a given gene sequence of interest. Here we propose the inverse problem, i.e., finding the exemplars within a group of related gene sequences. Using the affinity propagation clustering algorithm, six exemplars of the HA sequences were identified to represent six clusters. One of the clusters contained strain A/Brisbane/12/2010(H1N1), which differed from A/Brisbane/10/2010 in the HA sequence only at position 449. Based on the sequence identity of the six exemplars, nine mutations in HA were located that could be used to distinguish these six clusters. Finally, we discovered a change in the correlation patterns between the HA and NA of 2009 H1N1 as a result of the HA receptor binding specificity switch, revealing the balanced interplay between these two surface proteins of the virus.
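The comparison step described in the abstract (using sequence identity among exemplars to locate the positions that distinguish clusters) can be illustrated with a minimal sketch. The short sequences below are invented toy strings, not real HA sequences, and the function names `percent_identity` and `distinguishing_positions` are hypothetical helpers introduced here for illustration; the sketch also assumes the sequences are already aligned to equal length.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def distinguishing_positions(seqs: list[str]) -> list[int]:
    """1-based alignment positions at which the sequences are not all identical."""
    return [i + 1 for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


# Toy exemplar sequences (assumption: made up for this example only).
exemplars = {
    "exemplar_1": "MKAILVVLLY",
    "exemplar_2": "MKAILVALLY",
    "exemplar_3": "MKTILVVLLY",
}

# Pairwise identity between every pair of exemplars.
for (name_a, seq_a), (name_b, seq_b) in combinations(exemplars.items(), 2):
    print(name_a, name_b, percent_identity(seq_a, seq_b))

# Positions that could separate the clusters these exemplars represent.
print(distinguishing_positions(list(exemplars.values())))
```

On real data, the inputs would be the aligned HA protein sequences of the exemplars returned by the clustering step, and the reported positions would follow the numbering of that alignment.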
Funding: Supported by the Research Team Development Funds of L. Xue and Z. H. Ouyang, Electronic Countermeasure Institute, National University of Defense Technology.
Abstract: Affinity propagation (AP) is a widely used exemplar-based clustering approach with superior efficiency and clustering quality. Nevertheless, a common issue with AP clustering is the presence of excessive exemplars, which limits its ability to perform effective aggregation. This research aims to enable AP to aggregate automatically and produce fewer, more compact clusters without changing the similarity matrix or customizing the preference parameters, as existing enhanced approaches do. An automatic aggregation enhanced affinity propagation (AAEAP) clustering algorithm is proposed, which combines a dependable partitioning clustering approach with AP to achieve this purpose. Whenever the clustering stabilizes and the exemplars emerge, the partitioning clustering approach generates an additional set of results with the same number of clusters. Based on these results, mutually exclusive exemplar detection is conducted on the current AP exemplars, and a pair of exemplars unsuitable for coexistence is recommended. The recommendation is then mapped to a novel constraint, designated mutual exclusion and aggregation. To satisfy this constraint, a modified AP clustering model is derived and the clustering is restarted, which can result in a reduced number of exemplars, adjusted exemplar selection, and redistribution of the other data points. Automatic detection and clustering are repeated until no mutually exclusive exemplars are detected, at which point the clustering is complete and a smaller number of clusters is obtained. Several standard classification data sets are used in experiments comparing AAEAP with other clustering algorithms, and multiple internal and external clustering evaluation indexes are used to measure clustering performance. The results show that the AAEAP clustering algorithm achieves a substantial automatic aggregation effect while maintaining good clustering quality.
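For readers unfamiliar with the baseline method, the following is a minimal sketch of standard affinity propagation message passing (responsibility and availability updates with damping). It is not the AAEAP algorithm and does not implement the mutual exclusion and aggregation constraint; it only shows how exemplars emerge from a similarity matrix whose diagonal holds the preference values. The toy 1-D points, the preference of -50, and the function name `affinity_propagation` are assumptions chosen for illustration.

```python
def affinity_propagation(S, damping=0.5, max_iter=200):
    """Plain affinity propagation on a dense similarity matrix S (list of lists).

    S[k][k] holds the preference of point k. Returns (exemplars, labels).
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(max_iter):
        # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            first = vals[best]
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                rival = second if k == best else first
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) <- sum_{i' != k} max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    labels = []
    for i in range(n):
        if i in exemplars or not exemplars:
            labels.append(i)
        else:
            labels.append(max(exemplars, key=lambda k: S[i][k]))
    return exemplars, labels


# Toy data: two well-separated groups on a line (assumption for illustration).
points = [0.0, 1.0, 2.0, 100.0, 101.0, 102.0]
preference = -50.0  # assumed value; more negative preferences give fewer exemplars
S = [[preference if i == k else -(points[i] - points[k]) ** 2
      for k in range(len(points))] for i in range(len(points))]

exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

The number of exemplars depends on the preference values on the diagonal, which is why many enhanced variants tune them; the abstract above states that AAEAP instead leaves the similarity matrix and preferences unchanged and reduces exemplars through an added constraint.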