Funding: Supported by the Science and Technology Development Fund of Macao (Grant No. 0079/2019/AMJ) and the National Key R&D Program of China (No. 2019YFE0111400).
Abstract: Urban sewer pipes are vital infrastructure in modern cities, and their defects must be detected in time to prevent malfunctions. In recent years, deep learning models have been introduced to identify potential defects automatically and reduce the manual effort required of human experts. However, these models fall short in dataset complexity, model versatility, and performance. Our work addresses these issues with a multi-stage defect detection architecture that uses a composite-backbone Swin Transformer. The model based on this architecture is trained on a more comprehensive dataset containing more classes of defects. Ablation studies on the composite-backbone Swin Transformer, the multi-stage detector, test-time data augmentation, and model fusion show that each module contributes to the improvement in detection accuracy in a different way. The model incorporating all of these modules achieves a mean Average Precision (mAP) of 78.6% at an Intersection over Union (IoU) threshold of 0.5. This is an improvement of 14.1% over the ResNet50 Faster Region-based Convolutional Neural Network (R-CNN) model and of 6.7% over You Only Look Once version 6 (YOLOv6)-large, the best-performing of the YOLO methods. Direct comparison with other sewer pipe defect detection models is infeasible because their private datasets are unavailable; nevertheless, our results are obtained from a more comprehensive dataset and show superior generalization capabilities.
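The reported mAP is evaluated at an IoU threshold of 0.5. As a minimal sketch of what that threshold means, the snippet below computes the IoU of two axis-aligned boxes and checks it against 0.5. The `(x1, y1, x2, y2)` box format, the function names `iou` and `matches_at_threshold`, and the use of `>=` for the comparison are illustrative assumptions; they are not taken from the paper's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Assumed format (not from the paper): (x1, y1, x2, y2) with x1 < x2, y1 < y2.
    """
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def matches_at_threshold(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: does a prediction overlap a ground-truth box enough?

    Assumes the common convention that IoU >= threshold counts as a match.
    """
    return iou(pred_box, gt_box) >= threshold
```

For example, `iou((0, 0, 10, 10), (5, 0, 15, 10))` gives an intersection of 50 over a union of 150, so the prediction would not count as a match at the 0.5 threshold.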