Abstract
A hierarchical clustering algorithm has been applied to identify X-ray diffraction (XRD) patterns from high-throughput characterization of combinatorial materials chips. As data quality is usually correlated with acquisition time, it is important to study hierarchical clustering performance as a function of data quality in order to optimize the efficiency of high-throughput experiments. This work investigated the effects of signal-to-noise ratio on the performance of hierarchical clustering using 29 distance metrics for the XRD patterns from an Fe−Co−Ni ternary combinatorial materials chip. The clustering accuracies, evaluated by the F1 score, fluctuate only slightly as the signal-to-noise ratio varies from 15.5 to 22.3 dB under the experimental conditions. This suggests that although it may take 40-50 s to collect a visually high-quality diffraction pattern, the measurement time could be reduced to as low as 4 s without substantial loss in phase identification accuracy by hierarchical clustering. Among the 29 distance metrics, Pearson χ² shows the highest mean F1 score of 0.77 and the lowest standard deviation of 0.008. The distance matrices calculated by Pearson χ² are shown to be mainly controlled by the XRD peak shifting characteristics, as visualized by metric multidimensional scaling.
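As an illustration of the approach summarized above, the following minimal sketch clusters four synthetic single-peak patterns with a Pearson χ² distance. It is not the authors' implementation. The abstract does not define the metric, so the sketch assumes the common form d(P, Q) = Σ (P_i − Q_i)² / Q_i on area-normalized patterns. Averaging the two directions to obtain a symmetric matrix, the average-linkage rule, the synthetic Gaussian peaks, and all function names are likewise assumptions made only for illustration.

```python
import math

def gaussian_pattern(center, n=200, width=3.0, background=0.05):
    """Synthetic single-peak pattern on a constant background (illustrative only)."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) + background
            for i in range(n)]

def normalize(pattern):
    """Scale a pattern so its intensities sum to 1."""
    total = sum(pattern)
    return [v / total for v in pattern]

def pearson_chi2(p, q, eps=1e-12):
    """Assumed Pearson chi-squared form: sum((p_i - q_i)^2 / q_i). Asymmetric."""
    return sum((a - b) ** 2 / (b + eps) for a, b in zip(p, q))

def symmetric_chi2(p, q):
    """Average of both directions, an assumption made to get a symmetric matrix."""
    return 0.5 * (pearson_chi2(p, q) + pearson_chi2(q, p))

def distance_matrix(patterns):
    """Full pairwise distance matrix as a list of lists."""
    return [[symmetric_chi2(p, q) for q in patterns] for p in patterns]

def agglomerate(dist, n_clusters):
    """Plain average-linkage agglomerative clustering on a distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

# Two hypothetical "phases": peaks near channel 80 and near channel 120,
# each with a one-channel peak shift between its two members.
patterns = [normalize(gaussian_pattern(c)) for c in (80, 81, 120, 121)]
dist = distance_matrix(patterns)
clusters = agglomerate(dist, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

In this toy case the small peak shifts within each group give much smaller distances than the large shift between groups, so the two groups separate cleanly. Real XRD data would also need background handling and a choice of where to cut the dendrogram, neither of which is specified in the abstract.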
Funding
This work was funded by the National Key Research and Development Program of China (Grant Nos. 2021YFB370-2102 and 2017YFB0701900).