Abstract: Code smell detection is essential for improving software quality, enhancing maintainability, and decreasing the risk of faults and failures in a software system. In this paper, we propose a code smell prediction approach based on machine learning techniques and software metrics. The local interpretable model-agnostic explanations (LIME) algorithm is further used to explain the machine learning models' predictions and improve their interpretability. The datasets obtained from Fontana et al. were restructured and used to build binary-label and multi-label datasets. The results of 10-fold cross-validation show that tree-based algorithms (mainly Random Forest) perform better than kernel-based and network-based algorithms. Genetic-algorithm-based feature selection enhances the accuracy of these machine learning algorithms by selecting the most relevant features in each dataset. Moreover, parameter optimization based on the grid search algorithm significantly enhances the accuracy of all these algorithms. Finally, machine learning techniques show high potential for predicting code smells, which contributes to detecting these smells and enhancing software quality.
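The evaluation protocol named in this abstract (grid search scored by 10-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. Everything concrete here is an assumption made for illustration: a small k-nearest-neighbour classifier stands in for the paper's Random Forest and other models, the two synthetic metrics (lines of code and cyclomatic complexity) are hypothetical and are not the Fontana et al. data, and the function names are invented for this example.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and deal them into n_folds disjoint folds.

    Assumes n_samples >= n_folds so that no fold is empty.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, point, n_neighbors):
    """Majority vote among the n_neighbors training rows closest to point."""
    by_distance = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], point)),
    )
    votes = [train_y[i] for i in by_distance[:n_neighbors]]
    return 1 if 2 * sum(votes) > len(votes) else 0


def cross_val_accuracy(x, y, n_neighbors, n_folds=10, seed=0):
    """Mean held-out accuracy over n_folds train/test splits."""
    scores = []
    for fold in k_fold_indices(len(x), n_folds, seed):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        hits = sum(
            knn_predict(train_x, train_y, x[i], n_neighbors) == y[i] for i in fold
        )
        scores.append(hits / len(fold))
    return mean(scores)


def grid_search(x, y, grid, n_folds=10):
    """Return (best_value, best_score) for the grid value with the highest CV accuracy."""
    scored = [(cross_val_accuracy(x, y, value, n_folds), value) for value in grid]
    best_score, best_value = max(scored, key=lambda pair: pair[0])
    return best_value, best_score


# Hypothetical synthetic data: (lines of code, cyclomatic complexity) per class.
rng = random.Random(42)
clean = [(rng.uniform(20, 250), rng.uniform(1, 12)) for _ in range(30)]
smelly = [(rng.uniform(150, 600), rng.uniform(8, 40)) for _ in range(30)]
features = clean + smelly
labels = [0] * 30 + [1] * 30  # 1 = smelly, 0 = not smelly

best_k, best_score = grid_search(features, labels, grid=[1, 3, 5, 7])
print(f"best n_neighbors={best_k}, 10-fold accuracy={best_score:.3f}")
```

Each candidate hyperparameter value is scored only on folds that were held out from training, which is what makes the comparison between grid points fair. The genetic-algorithm feature selection and the LIME explanations mentioned in the abstract are not covered by this sketch.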
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U22B2068, 52275520, 52075078) and the National Key Research and Development Program of China (Grant No. 2019YFA0709003).
Abstract: Ultrasonic testing (UT) is increasingly combined with machine learning (ML) techniques for intelligently identifying damage. Extracting significant features from UT data is essential for efficient defect characterization. Moreover, the hidden physics behind ML is unexplained, reducing the generalization capability and versatility of ML methods in UT. In this paper, a generally applicable ML framework based on a model interpretation strategy is proposed to improve the detection accuracy and computational efficiency of UT. First, multi-domain features are extracted from the UT signals with signal processing techniques to construct an initial feature space. Subsequently, a feature selection method based on a model-interpretable strategy (FS-MIS) is developed by integrating Shapley additive explanation (SHAP), the filter method, the embedded method, and the wrapper method. The most effective ML model and the optimal feature subset with better correlation to the target defects are determined self-adaptively. The proposed framework is validated by identifying and locating side-drilled holes (SDHs) with 0.5λ central distance and different depths. An ultrasonic array probe is used to experimentally acquire FMC datasets from several aluminum alloy specimens containing two SDHs. The optimal feature subset selected by FS-MIS is used as the input of the chosen ML model to train and predict the times of arrival (ToAs) of the scattered waves emitted by adjacent SDHs. The experimental results demonstrate that the relative errors of the predicted ToAs are all below 3.67%, with an average error of 0.25%, significantly improving the time resolution of UT signals. On this basis, the predicted ToAs are assigned to the corresponding original signals for decoupling overlapped pulse-echoes and reconstructing high-resolution FMC datasets. The imaging resolution is enhanced to 0.5λ by implementing the total focusing method (TFM). The relative errors of hole depths and central distance are no more than 0.51% and 3.57%, respectively. Finally, the superior performance of the proposed
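The Shapley-based ranking that FS-MIS builds on can be illustrated with a short standard-library sketch. It computes exact Shapley values against a single baseline input, which is a simplification of SHAP (SHAP averages over background data and usually approximates the sum). The toy model, the feature names, and the function names are hypothetical and are not taken from the paper, and the filter, embedded, and wrapper stages of FS-MIS are not reproduced.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature.

    Features outside a coalition are set to their baseline value.
    The cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                members = set(coalition)
                without_i = [x[j] if j in members else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def rank_features(predict, samples, baseline):
    """Order feature indices by mean absolute Shapley value, most important first."""
    n = len(baseline)
    totals = [0.0] * n
    for x in samples:
        for i, value in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(value)
    importance = [total / len(samples) for total in totals]
    order = sorted(range(n), key=lambda i: importance[i], reverse=True)
    return order, importance


# Hypothetical feature names and toy model, for illustration only.
feature_names = ["peak_amplitude", "center_frequency", "envelope_width"]


def toy_model(f):
    """Stand-in regressor: one additive term plus one interaction term."""
    return 2.0 * f[0] + 0.5 * f[1] * f[2]


baseline = [0.0, 0.0, 0.0]
samples = [[1.0, 2.0, 3.0], [2.0, 1.0, 1.0]]

order, importance = rank_features(toy_model, samples, baseline)
for i in order:
    print(f"{feature_names[i]}: mean |Shapley value| = {importance[i]:.3f}")
```

A feature subset could then be chosen by keeping the top-ranked indices in `order`. How many to keep, and how this ranking is combined with the other selection methods, is specific to the paper and is not assumed here.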