Funding: North Dakota Agricultural Experiment Station Precision Agriculture Graduate Research Assistantship (6064-21660-001-32S); USDA Agricultural Research Service Project (435589).
Funding: National Defense Pre-research Fund Project (No. KMGY318002531).
Abstract: To address the detection of small objects in unmanned aerial vehicle (UAV) aerial images with complex backgrounds, a general detection method for multi-scale small objects based on the Faster region-based convolutional neural network (Faster R-CNN) is proposed. The bird's nest on the high-voltage tower is taken as the research object. Firstly, an improved ResNet101 convolutional neural network is used to extract object features; then multi-scale sliding windows are used to obtain object region proposals on convolutional feature maps of different resolutions. Finally, a deconvolution operation is added to further enhance the selected higher-resolution feature map, which is then taken as the feature mapping layer that passes the region proposals to the object detection sub-network. Detection results for bird's nests in UAV aerial images show that the proposed method can precisely detect small objects in aerial images.
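As a hedged illustration of the multi-scale sliding-window idea described above, the sketch below generates candidate boxes at several scales and aspect ratios around one feature-map location; the scale and ratio values are illustrative assumptions, not taken from the paper.

```python
# Sketch of multi-scale anchor (sliding-window) generation at one
# feature-map cell, in the spirit of region-proposal networks.
# Scales/ratios below are illustrative assumptions.

def make_anchors(cx, cy, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) candidate boxes centred on (cx, cy)."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * (r ** 0.5)   # width grows with sqrt(ratio)
            h = s / (r ** 0.5)   # height shrinks, keeping area ~ s*s
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

boxes = make_anchors(100, 100)
print(len(boxes))  # 9 proposals per location (3 scales x 3 ratios)
```

Running the same generator at every cell of each feature map yields proposals covering small and large objects alike, which is the role the multi-scale sliding windows play in the method above.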
Funding: Supported by the National Natural Science Foundation of China (NSFC) (No. 61772358), the National Key R&D Program Funded Project (No. 2021YFE0105500), and the Jiangsu University 'Blue Project'.
Abstract: Breast cancer has become a major threat to women's health. To exploit the representational capabilities of models more comprehensively, we propose a multi-model fusion strategy. Specifically, we combine two differently structured deep learning models, ResNet101 and the Swin Transformer (SwinT), with the Convolutional Block Attention Module (CBAM) attention mechanism, making full use of SwinT's global context modeling ability and ResNet101's local feature extraction ability. In addition, the cross-entropy loss function is replaced by the focal loss function to address class imbalance in breast cancer datasets. The multi-class recognition accuracies of the proposed fusion model on the 40X, 100X, 200X, and 400X BreakHis datasets are 97.50%, 96.60%, 96.30%, and 96.10%, respectively. Compared with a single SwinT model or ResNet101 model, the fusion model has higher accuracy and better generalization ability, providing a more effective method for screening, diagnosis, and pathological classification of breast cancer.
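The focal loss mentioned above can be sketched compactly; the minimal pure-Python version below computes it for the true-class probability p (gamma = 2 is a commonly used default assumed here, not a value stated in the abstract).

```python
import math

def focal_loss(p, gamma=2.0):
    """Focal loss for a true class predicted with probability p.
    With gamma = 0 this reduces to plain cross-entropy -log(p);
    larger gamma down-weights easy (high-p) examples, which is
    why it helps with imbalanced datasets."""
    return -((1.0 - p) ** gamma) * math.log(p)

# An easy example contributes far less than under cross-entropy:
easy_ce = -math.log(0.95)
easy_fl = focal_loss(0.95)
```

The down-weighting factor (1 - p)^gamma is what shifts training effort toward the minority-class, hard-to-classify samples.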
Funding: Supported in part by the National Natural Science Fund of China (61806088, 61902160), the Qing Lan Project of Jiangsu Province, the Natural Science Foundation of Jiangsu Province (BK20160293), and the Changzhou Science and Technology Support Plan (CE20185044).
Abstract: When traditional computer vision inspection technology is used to locate vehicles, the results are unsatisfactory because of diverse scenes and uncertainty. We therefore present a new method based on an improved SSD model. We adopt ResNet101 instead of the VGG16 used by the classic model to enhance the feature extraction ability of the algorithm. Meanwhile, the new method optimizes the loss function, such as the loss on the predicted offsets, making the loss drop more smoothly near zero. In addition, the new method improves the cross-entropy loss function for category prediction, effectively decreasing the loss when the probability of a positive prediction is high, and increases training speed. The VOC2012 dataset is used for the experiments. The results show that this method improves the average detection accuracy and reduces the training time of the model.
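The "drops more smoothly near zero" behaviour described for the offset loss matches the familiar smooth L1 (Huber-style) regression loss; the sketch below is one plausible reading with an assumed beta = 1, which may differ from the authors' exact formulation.

```python
def smooth_l1(x, beta=1.0):
    """Smooth L1 loss on a predicted-offset error x.
    Quadratic for |x| < beta, so the gradient decays smoothly
    toward zero; linear for large |x|, so outliers do not
    dominate training."""
    ax = abs(x)
    if ax < beta:
        return 0.5 * ax * ax / beta
    return ax - 0.5 * beta

print(smooth_l1(0.5), smooth_l1(2.0))  # 0.125 1.5
```

The two branches join with matching value and slope at |x| = beta, which is exactly what makes the curve smooth where plain L1 has a kink.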
Funding: Supported by the National Natural Science Foundation of China (Nos. 61672150, 61907007), the Jilin Provincial Science and Technology Department Project (Nos. 20180201089GX, 20190201305JC), the Provincial Department of Education Project (Nos. JJKH20190291KJ, JJKH20190294KJ, JJKH20190355KJ), and the Fundamental Research Funds for the Central Universities (No. 2412019FZ049).
Abstract: Nowadays, action recognition is widely applied in many fields. However, an action is hard to define with single-modality information. The difference between image recognition and action recognition is that action recognition needs more modalities to depict one action, such as appearance, motion, and dynamic information. Because the state of an action evolves over time, motion information must be considered when representing an action. Most current methods define an action by spatial and motion information. There are two key elements in current action recognition methods: spatial information, obtained by sampling sparsely over the video frame sequence, and motion content, mostly represented by optical flow computed on consecutive video frames. However, the relevance between them in current methods is weak. Therefore, to strengthen this association, this paper presents a new architecture consisting of three streams that obtain multi-modality information. The advantages of our network are: (a) we propose a new sampling approach that samples evenly over the video sequence to acquire appearance information; (b) we utilize ResNet101 to obtain high-level, discriminative features; (c) we propose a three-stream architecture to capture temporal, spatial, and dynamic information. Experimental results on the UCF101 dataset show that our method outperforms previous methods.
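The even-sampling step in (a) can be sketched as picking one frame from the middle of each of n equal segments; this is an illustrative reading of the approach, not the authors' exact code.

```python
def sample_evenly(n_frames, n_samples):
    """Return n_samples frame indices spread evenly over a video
    of n_frames frames: one index from the centre of each
    equal-length segment, so the whole action is covered rather
    than only a sparse local region."""
    seg = n_frames / n_samples
    return [int(i * seg + seg / 2) for i in range(n_samples)]

print(sample_evenly(100, 5))  # [10, 30, 50, 70, 90]
```

Compared with random sparse sampling, segment-centred indices guarantee that the beginning, middle, and end of the action all contribute appearance frames.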
Abstract: Liver cancer is the second leading cause of cancer death worldwide. Early tumor detection may help identify suitable treatment and increase the survival rate. Medical imaging is a non-invasive tool that can help uncover abnormalities in human organs. Magnetic Resonance Imaging (MRI), in particular, uses magnetic fields and radio waves to differentiate internal organ tissues. However, the interpretation of medical images requires the subjective expertise of a radiologist and an oncologist. Thus, an automated computer-based diagnosis system can help specialists reduce incorrect diagnoses. This paper proposes a hybrid automated system to compare the performance of 3D and 2D features in classifying magnetic resonance liver tumor images. Two models are proposed: the first employs 3D features, while the second exploits 2D features. The first system uses 3D texture attributes, 3D shape features, and 3D graphical deep descriptors, together with an ensemble classifier, to differentiate between four 3D tumor categories. In addition, the proposed method is applied to 2D slices for comparison purposes. The proposed approach attained 100% accuracy in discriminating between all types of 3D liver tumors, with 100% Area Under the Curve (AUC), 100% sensitivity, and 100% specificity and precision. In contrast, performance is lower in 2D classification: the maximum accuracy reached 96.4% for two classes and 92.1% for four classes. The top performance of the proposed system can be attributed to the exploitation of various feature selection methods, in particular the ReliefF technique, to choose the most relevant features for the different classes. The novelty of this work lies in building a highly accurate system, under specific circumstances, without any image preprocessing or human input, and in comparing 2D and 3D classification performance. In the future, the presented work can be extended to larger datasets. Then, it can be a reliab…
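The ReliefF selection step above can be illustrated with the simpler two-class Relief idea it generalizes: a feature scores high when it separates each sample from its nearest opposite-class neighbour ("miss") more than from its nearest same-class neighbour ("hit"). The sketch below is a simplified illustration of that idea, not the paper's implementation, and assumes each class has at least two samples.

```python
def relief_weights(X, y):
    """Simplified two-class Relief feature scoring.
    X: list of feature vectors, y: list of binary labels.
    Each sample pushes a feature's weight up by its distance to
    the nearest miss and down by its distance to the nearest hit,
    so class-separating features end up with large weights."""
    n, d = len(X), len(X[0])
    w = [0.0] * d

    def dist(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))  # Manhattan

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        nh = min(hits, key=lambda j: dist(X[i], X[j]))
        nm = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            w[f] += abs(X[i][f] - X[nm][f]) - abs(X[i][f] - X[nh][f])
    return [v / n for v in w]
```

Ranking features by these weights and keeping the top ones is the selection step; the full ReliefF algorithm averages over k nearest neighbours and handles multiple classes.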
Funding: This research was supported by the Honam University Research Fund, 2021.
Abstract: This study aims to detect and prevent greening disease in citrus trees using a deep neural network. Collecting data on citrus greening disease is very difficult because the vector pests are too small. Since the amount of data collected for deep learning is insufficient, we exploit the efficient feature extraction of a neural network based on the Transformer algorithm. We use the Cascade Region-based Convolutional Neural Networks (Cascade R-CNN) Swin model, which combines the Transformer model with the Cascade R-CNN model, to detect greening disease in citrus. We improve model robustness by establishing a linear relationship between samples using the Mixup and CutMix algorithms, which are image-processing-based data augmentation techniques. In addition, higher accuracy is obtained by using the ImageNet dataset, transfer learning, and stochastic weight averaging (SWA). This study compared the Faster Region-based Convolutional Neural Networks ResNet101 (Faster R-CNN ResNet101) model, the Cascade R-CNN ResNet101 model, and the Cascade R-CNN Swin model. The Faster R-CNN ResNet101 model achieved Average Precision (AP) (Intersection over Union (IoU) = 0.5): 88.2%, AP (IoU = 0.75): 62.8%, and Recall: 68.2%; the Cascade R-CNN ResNet101 model achieved AP (IoU = 0.5): 91.5%, AP (IoU = 0.75): 67.2%, and Recall: 73.1%. Meanwhile, the Cascade R-CNN Swin model showed AP (IoU = 0.5): 94.9%, AP (IoU = 0.75): 79.8%, and Recall: 76.5%. Thus, the Cascade R-CNN Swin model showed the best results for detecting citrus greening disease.
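Of the augmentation techniques above, Mixup is the simpler to sketch: two images and their one-hot labels are blended with a weight drawn from Beta(alpha, alpha). The version below works on nested lists of pixel values; alpha = 0.2 is an assumed default, not a value stated in the abstract.

```python
import random

def mixup(img_a, img_b, label_a, label_b, alpha=0.2):
    """Mixup data augmentation sketch: blend two images (nested
    lists of pixel values) and their one-hot labels with a single
    random weight lam ~ Beta(alpha, alpha), creating the linear
    relationship between samples mentioned above."""
    lam = random.betavariate(alpha, alpha)
    mixed = [[lam * pa + (1 - lam) * pb for pa, pb in zip(ra, rb)]
             for ra, rb in zip(img_a, img_b)]
    label = [lam * la + (1 - lam) * lb
             for la, lb in zip(label_a, label_b)]
    return mixed, label
```

Because the same lam mixes both pixels and labels, the model is trained to behave linearly between training examples, which tends to regularize small datasets like the citrus one described here.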