In this paper, a software bug classification algorithm, CLUBAS (Classification of Software Bugs Using Bug Attribute Similarity) is presented. CLUBAS is a hybrid algorithm, and is designed by using text clustering, fre...In this paper, a software bug classification algorithm, CLUBAS (Classification of Software Bugs Using Bug Attribute Similarity) is presented. CLUBAS is a hybrid algorithm, and is designed by using text clustering, frequent term calculations and taxonomic terms mapping techniques. The algorithm CLUBAS is an example of classification using clustering technique. The proposed algorithm works in three major steps, in the first step text clusters are created using software bug textual attributes data and followed by the second step in which cluster labels are generated using label induction for each cluster, and in the third step, the cluster labels are mapped against the bug taxonomic terms to identify the appropriate categories of the bug clusters. The cluster labels are generated using frequent and meaningful terms present in the bug attributes, for the bugs belonging to the bug clusters. The designed algorithm is evaluated using the performance parameters F-measures and accuracy. These parameters are compared with the standard classification techniques like Na?ve Bayes, Naive Bayes Multinomial, J48, Support Vector Machine and Weka’s classification using clustering algorithms. A GUI (Graphical User Interface) based tool is also developed in java for the implementation of CLUBAS algorithm.展开更多
文摘In this paper, a software bug classification algorithm, CLUBAS (Classification of Software Bugs Using Bug Attribute Similarity) is presented. CLUBAS is a hybrid algorithm, and is designed by using text clustering, frequent term calculations and taxonomic terms mapping techniques. The algorithm CLUBAS is an example of classification using clustering technique. The proposed algorithm works in three major steps, in the first step text clusters are created using software bug textual attributes data and followed by the second step in which cluster labels are generated using label induction for each cluster, and in the third step, the cluster labels are mapped against the bug taxonomic terms to identify the appropriate categories of the bug clusters. The cluster labels are generated using frequent and meaningful terms present in the bug attributes, for the bugs belonging to the bug clusters. The designed algorithm is evaluated using the performance parameters F-measures and accuracy. These parameters are compared with the standard classification techniques like Na?ve Bayes, Naive Bayes Multinomial, J48, Support Vector Machine and Weka’s classification using clustering algorithms. A GUI (Graphical User Interface) based tool is also developed in java for the implementation of CLUBAS algorithm.