The new coronavirus (COVID-19), declared by the World Health Organization as a pandemic, has infected more than 1 million people and killed more than 50 thousand. An infection caused by COVID-19 can develop into pneumonia, which can be detected by a chest X-ray exam and should be treated appropriately. In this work, we propose an automatic detection method for COVID-19 infection based on chest X-ray images. The datasets constructed for this study are composed of 194 X-ray images of patients diagnosed with coronavirus and 194 X-ray images of healthy patients. Since few images of patients with COVID-19 are publicly available, we apply the concept of transfer learning for this task. We use different architectures of convolutional neural networks (CNNs) trained on ImageNet, and adapt them to behave as feature extractors for the X-ray images. Then, the CNNs are combined with consolidated machine learning methods, such as k-Nearest Neighbor, Bayes, Random Forest, multilayer perceptron (MLP), and support vector machine (SVM). The results show that, for one of the datasets, the extractor-classifier pair with the best performance is the MobileNet architecture with the SVM classifier using a linear kernel, which achieves an accuracy and an F1-score of 98.5%. For the other dataset, the best pair is DenseNet201 with MLP, achieving an accuracy and an F1-score of 95.6%. Thus, the proposed approach demonstrates efficiency in detecting COVID-19 in X-ray images.
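To make the extractor-classifier idea concrete, here is a minimal sketch, assuming a torchvision MobileNetV2 backbone and a scikit-learn linear-kernel SVM (the paper's exact MobileNet variant and preprocessing are not specified, so those choices are illustrative): the pretrained CNN is frozen and used only to produce feature vectors, which are then classified.

```python
# Sketch: frozen ImageNet-pretrained CNN as a feature extractor + linear SVM.
import torch
import torch.nn as nn
from torchvision import models, transforms
from sklearn.svm import SVC

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

backbone = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
backbone.classifier = nn.Identity()          # keep only the feature extractor
backbone.eval().to(device)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(pil_images):
    """Map a list of PIL X-ray images to (N, 1280) feature vectors."""
    batch = torch.stack([preprocess(img) for img in pil_images]).to(device)
    return backbone(batch).cpu().numpy()

# xray_images / labels are hypothetical placeholders for the chest X-ray dataset:
# features = extract_features(xray_images)
# clf = SVC(kernel="linear").fit(features, labels)
```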
In the last decade, financial market forecasting has attracted considerable interest among researchers in pattern recognition. Usually, the data used to analyse the market, and then bet on its future trend, are provided as time series; this aspect, along with the high fluctuation of this kind of data, rules out the direct use of very efficient and popular classification tools, such as well-known convolutional neural network (CNN) models like Inception, ResNet, AlexNet, and so on. This forces researchers to train new tools from scratch, which can be very time consuming. This paper exploits an ensemble of CNNs, trained over Gramian angular field (GAF) images generated from time series related to the Standard & Poor's 500 index future; the aim is the prediction of the future trend of the U.S. market. A multi-resolution imaging approach is used to feed each CNN, enabling the analysis of different time intervals for a single observation. A simple trading system based on the ensemble forecaster is used to evaluate the quality of the proposed approach. Our method outperforms the buy-and-hold (B&H) strategy in a time frame where the latter provides excellent returns. Both quantitative and qualitative results are provided.
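The core preprocessing step described above, mapping a time-series window to a Gramian angular field image, can be sketched as follows (a minimal GASF implementation; the paper's exact scaling and multi-resolution setup are not reproduced here):

```python
# Sketch: Gramian angular (summation) field image from a 1-D price window.
import numpy as np

def gramian_angular_field(series: np.ndarray) -> np.ndarray:
    """Map a 1-D series to a GASF matrix via the polar-coordinate encoding."""
    x = np.asarray(series, dtype=float)
    # rescale to [-1, 1] so arccos is defined
    x = 2.0 * (x - x.min()) / (x.max() - x.min() + 1e-12) - 1.0
    phi = np.arccos(np.clip(x, -1.0, 1.0))
    # GASF(i, j) = cos(phi_i + phi_j)
    return np.cos(phi[:, None] + phi[None, :])

window = np.cumsum(np.random.randn(64))      # stand-in for an S&P 500 window
gaf_image = gramian_angular_field(window)    # (64, 64) image fed to a CNN
```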
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not capture a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The presented approach firstly transforms the segmented acoustic scenes into bump and Morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are fed into pre-trained convolutional neural networks; thirdly, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification dataset of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
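The final fusion step can be read as picking, per clip, the system with the most decisive prediction. The sketch below assumes one common reading of a margin sampling value strategy, namely the margin between the two highest class probabilities; the paper's exact rule may differ.

```python
# Sketch: late fusion by margin sampling value (assumed interpretation).
import numpy as np

def fuse_by_margin(prob_list):
    """prob_list: list of (num_classes,) probability vectors, one per system."""
    margins = []
    for p in prob_list:
        top2 = np.sort(p)[-2:]               # two largest probabilities
        margins.append(top2[1] - top2[0])    # decisiveness of this system
    best = int(np.argmax(margins))
    return int(np.argmax(prob_list[best])), best

probs_spectrogram = np.array([0.60, 0.30, 0.10])   # spectrogram system
probs_bump = np.array([0.45, 0.40, 0.15])          # bump-scalogram system
label, chosen_system = fuse_by_margin([probs_spectrogram, probs_bump])
```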
Graph neural networks have been shown to be very effective in utilizing pairwise relationships across samples. Recently, there have been several successful proposals to generalize graph neural networks to hypergraph neural networks to exploit more complex relationships. In particular, hypergraph collaborative networks yield superior results compared to other hypergraph neural networks for various semi-supervised learning tasks. The collaborative network can provide high-quality vertex embeddings and hyperedge embeddings together by formulating them as a joint optimization problem and by using their consistency in reconstructing the given hypergraph. In this paper, we aim to establish the algorithmic stability of the core layer of the collaborative network and provide generalization guarantees. The analysis sheds light on the design of hypergraph filters in collaborative networks, for instance, how the data and hypergraph filters should be scaled to achieve uniform stability of the learning process. Some experimental results on real-world datasets are presented to illustrate the theory.
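For readers unfamiliar with hypergraph filters, the sketch below shows the standard normalised incidence-matrix propagation that such layers build on (a generic hypergraph filter, not the collaborative network's exact core layer); the scaling of the data and of this operator is precisely the kind of quantity the stability analysis constrains.

```python
# Sketch: one generic hypergraph convolution-style layer using the incidence matrix.
import numpy as np

def hypergraph_filter(X, H, W):
    """X: (n_vertices, d) features, H: (n_vertices, n_edges) incidence, W: (d, d') weights."""
    Dv = np.diag(1.0 / np.sqrt(H.sum(axis=1) + 1e-12))   # inverse sqrt vertex degrees
    De = np.diag(1.0 / (H.sum(axis=0) + 1e-12))          # inverse hyperedge degrees
    A = Dv @ H @ De @ H.T @ Dv                           # normalised propagation operator
    return np.maximum(A @ X @ W, 0.0)                    # one layer with ReLU

X = np.random.randn(5, 4)                    # toy vertex features
H = (np.random.rand(5, 3) > 0.5).astype(float)   # toy incidence matrix
out = hypergraph_filter(X, H, np.random.randn(4, 2))
```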
Oscillation detection has been a hot research topic in industry due to the high incidence of oscillation loops and their negative impact on plant profitability. Although numerous automatic detection techniques have been proposed, most of them can only address part of the practical difficulties. An oscillation is heuristically defined as a visually apparent periodic variation. However, manual visual inspection is labor-intensive and prone to missed detections. Convolutional neural networks (CNNs), inspired by animal visual systems, offer powerful feature extraction capabilities. In this work, we explore typical CNN models for visual oscillation detection. Specifically, we tested the MobileNet-V1, ShuffleNet-V2, EfficientNet-B0, and GhostNet models, and found that such a visual framework is well suited for oscillation detection. The feasibility and validity of this framework are verified on extensive numerical and industrial cases. Compared with state-of-the-art oscillation detectors, the suggested framework is more straightforward and more robust to noise and mean-nonstationarity. In addition, this framework generalizes well and is capable of handling features that are not present in the training data, such as multiple oscillations and outliers.
This study assesses the suitability of convolutional neural networks (CNNs) for downscaling precipitation over East Africa in the context of seasonal forecasting. To achieve this, we design a set of experiments that compare different CNN configurations and deploy the best-performing architecture to downscale one-month-lead seasonal forecasts of June–July–August–September (JJAS) precipitation from the Nanjing University of Information Science and Technology Climate Forecast System version 1.0 (NUIST-CFS1.0) for 1982–2020. We also perform hyper-parameter optimization and introduce predictors over a larger area to include information about the main large-scale circulations that drive precipitation over the East Africa region, which improves the downscaling results. Finally, we validate the raw model and downscaled forecasts in terms of both deterministic and probabilistic verification metrics, as well as their ability to reproduce the observed precipitation extreme and spell indicator indices. The results show that the CNN-based downscaling consistently improves the raw model forecasts, with lower bias and more accurate representations of the observed mean and extreme precipitation spatial patterns. In addition, CNN-based downscaling yields much more accurate forecasts of extreme and spell indicators and reduces the significant relative biases exhibited by the raw model predictions. Moreover, our results show that CNN-based downscaling yields better skill scores than the raw model forecasts over most portions of East Africa. The results demonstrate the potential usefulness of CNNs in downscaling seasonal precipitation predictions over East Africa, particularly in providing improved forecast products which are essential for end users.
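As a rough illustration of what a CNN downscaler looks like, the sketch below maps coarse predictor fields to a finer precipitation grid with convolutions plus upsampling; it is a generic toy architecture, not the tuned configuration selected in the study.

```python
# Sketch: a toy CNN downscaler from coarse predictor fields to a finer grid.
import torch
import torch.nn as nn

class SimpleDownscaler(nn.Module):
    def __init__(self, n_predictors: int, upscale: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_predictors, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=upscale, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, kernel_size=3, padding=1),
            nn.ReLU(),                       # precipitation is non-negative
        )

    def forward(self, x):
        return self.net(x)

coarse = torch.randn(8, 5, 24, 24)                 # batch of 5 predictor fields
fine = SimpleDownscaler(n_predictors=5)(coarse)    # (8, 1, 96, 96)
```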
Although convolutional neural network (CNN) paradigms have expanded from original individual CNN architectures to transfer learning and ensemble models, few studies have compared the applicability of these techniques in detecting and localizing rice diseases. Moreover, most CNN-based rice disease detection studies only considered a small number of diseases in their experiments. Both shortcomings are addressed in this study. A rice disease classification comparison of six CNN-based deep-learning architectures (DenseNet121, InceptionV3, MobileNetV2, ResNeXt101, ResNet152V, and SEResNeXt101) was conducted using a database of nine of the most epidemic rice diseases in Bangladesh. In addition, we applied a transfer learning approach to DenseNet121, MobileNetV2, ResNet152V, and SEResNeXt101, and an ensemble model called DEX (DenseNet121, EfficientNetB7, and Xception), to compare the six individual CNN networks, transfer learning, and ensemble techniques. The results suggest that the ensemble framework provides the best accuracy of 98%, and transfer learning can increase the accuracy by 17% over the results obtained by SEResNeXt101 in detecting and localizing rice leaf diseases. The high accuracy in detecting and categorising rice leaf diseases suggests that the deep CNN model is promising in the plant disease detection domain and can significantly impact the detection of diseases in real-time agricultural systems. This research is significant for farmers in rice-growing countries because, as with many other plant diseases, rice diseases require timely and early identification. The CNN-based rice leaf disease detection system developed here is expected to help farmers make fast decisions to protect their agricultural yields and quality.
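The ensemble step can be illustrated with a simple softmax-averaging sketch; the member models (DenseNet121, EfficientNetB7, and Xception in DEX) are assumed to be already fine-tuned classifiers, and the averaging rule here is an assumption rather than the paper's exact fusion scheme.

```python
# Sketch: ensemble prediction by averaging softmax outputs of several classifiers.
import numpy as np

def ensemble_predict(prob_matrices):
    """prob_matrices: list of (n_samples, n_classes) softmax outputs, one per model."""
    return np.mean(np.stack(prob_matrices, axis=0), axis=0).argmax(axis=1)

# p1, p2, p3 would come from the fitted member models, e.g.:
# p1, p2, p3 = model1.predict(x), model2.predict(x), model3.predict(x)
# labels = ensemble_predict([p1, p2, p3])
```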
Recently, video object segmentation has received great attention in the computer vision community. Most of the existing methods heavily rely on pixel-wise human annotations, which are expensive and time-consuming to obtain. To tackle this problem, we make an early attempt to achieve video object segmentation with scribble-level supervision, which can greatly reduce the human labor required to collect manual annotations. However, conventional network architectures and learning objective functions do not work well under this scenario, because the supervision information is highly sparse and incomplete. To address this issue, this paper introduces two novel elements to learn the video object segmentation model. The first is the scribble attention module, which captures more accurate context information and learns an effective attention map to enhance the contrast between foreground and background. The other is the scribble-supervised loss, which can optimize the unlabeled pixels and dynamically correct inaccurate segmented areas during the training stage. To evaluate the proposed method, we implement experiments on two video object segmentation benchmark datasets, YouTube-video object segmentation (VOS) and densely annotated video segmentation (DAVIS)-2017. We first generate the scribble annotations from the original per-pixel annotations. Then, we train our model and compare its test performance with the baseline models and other existing works. Extensive experiments demonstrate that the proposed method works effectively and approaches the performance of methods requiring dense per-pixel annotations.
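A minimal sketch of scribble-level supervision is a partial cross-entropy that only scores scribbled pixels and ignores the rest; the paper's scribble-supervised loss additionally optimizes the unlabeled pixels, which is omitted here.

```python
# Sketch: partial cross-entropy over scribbled pixels only (toy example).
import torch
import torch.nn.functional as F

def partial_cross_entropy(logits, scribble_labels, ignore_index=255):
    """logits: (B, C, H, W); scribble_labels: (B, H, W) with ignore_index for unlabeled pixels."""
    return F.cross_entropy(logits, scribble_labels, ignore_index=ignore_index)

logits = torch.randn(2, 2, 64, 64, requires_grad=True)
labels = torch.full((2, 64, 64), 255, dtype=torch.long)  # 255 = unlabeled
labels[:, 30:34, :] = 1                      # a thin foreground scribble
labels[:, :4, :] = 0                         # a background scribble
loss = partial_cross_entropy(logits, labels)
loss.backward()
```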
Tuberculosis (TB) is a severe infection that mostly affects the lungs and claims millions of lives every year. Tuberculosis can be diagnosed using chest X-rays (CXR) and data-driven deep learning (DL) approaches. Because of their strong automated feature extraction capability, convolutional neural networks (CNNs) trained on natural images are particularly effective in image categorization. A combination of 3001 normal and 3001 TB CXR images was gathered for this study from different accessible public datasets. Ten different deep CNNs (ResNet50, ResNet101, ResNet152, InceptionV3, VGG16, VGG19, DenseNet121, DenseNet169, DenseNet201, MobileNet) are trained and tested for identifying TB and normal cases. This study presents a deep CNN approach based on histogram-matched CXR images that does not require segmentation of objects of interest, and this coupling of histogram matching with the CXRs improves the accuracy and detection performance of CNN models for TB detection. Furthermore, this research contains two separate experiments that used CXR images with and without histogram matching to classify TB and non-TB CXRs using deep CNNs. TB could be accurately detected from CXR images using pre-processing, data augmentation, and deep CNN models. Without histogram matching, the best accuracy, sensitivity, specificity, precision, and F1-score in the detection of TB using CXR images among the ten models are 99.25%, 99.48%, 99.52%, 99.48%, and 99.22%, respectively. With histogram matching, the best accuracy, sensitivity, specificity, precision, and F1-score are 99.58%, 99.82%, 99.67%, 99.65%, and 99.56%, respectively. The proposed methodology, which has cutting-edge performance, will be useful in computer-assisted TB diagnosis and aids in minimizing irregularities in TB detection in developing countries.
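The histogram-matching pre-processing step can be sketched with scikit-image, assuming each CXR is matched to a chosen reference image before being fed to the CNNs (the choice of reference here is illustrative):

```python
# Sketch: histogram matching of an input CXR to a reference CXR with scikit-image.
import numpy as np
from skimage.exposure import match_histograms

reference = np.random.rand(256, 256)          # stand-in for a reference CXR
image = np.random.rand(256, 256) ** 2         # stand-in for an input CXR
matched = match_histograms(image, reference)  # same shape, reference-like histogram
```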
Humongous amounts of data bring various challenges to face image retrieval. This paper proposes an efficient method to solve those problems. Firstly, we use accurate facial landmark locations as shape features. Secondly, we utilise shape priors to provide discriminative texture features for convolutional neural networks. These shape and texture features are fused to make the learned representation more robust. Finally, in order to increase efficiency, a coarse-to-fine search mechanism is exploited to efficiently find similar objects. Extensive experiments on the CASIA-WebFace, MSRA-CFW, and LFW datasets illustrate the superiority of our method.
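The coarse-to-fine search can be sketched as a two-stage nearest-neighbour lookup: a cheap distance on short codes shortlists candidates, and the full fused feature is only compared against that shortlist (dimensions and names below are illustrative):

```python
# Sketch: coarse-to-fine retrieval with a shortlist from low-dimensional codes.
import numpy as np

def coarse_to_fine_search(query_full, query_coarse, db_full, db_coarse, k_coarse=100, k=5):
    d_coarse = np.linalg.norm(db_coarse - query_coarse, axis=1)  # cheap pass
    shortlist = np.argsort(d_coarse)[:k_coarse]
    d_fine = np.linalg.norm(db_full[shortlist] - query_full, axis=1)  # refined pass
    return shortlist[np.argsort(d_fine)[:k]]

db_full = np.random.randn(1000, 512)          # full fused features
db_coarse = db_full[:, :32]                   # toy coarse codes
query_full = np.random.randn(512)
top5 = coarse_to_fine_search(query_full, query_full[:32], db_full, db_coarse)
```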
Reducing the defocus blur that arises from the finite aperture size and short exposure time is an essential problem in computational photography. It is very challenging because the blur kernel is spatially varying and difficult to estimate by traditional methods. Due to their great breakthroughs in low-level tasks, convolutional neural networks (CNNs) have been introduced to the defocus deblurring problem and have achieved significant progress. However, previous methods apply the same learned kernel to different regions of the defocus-blurred image, so it is difficult to handle nonuniformly blurred images. To this end, this study designs a novel blur-aware multi-branch network (BaMBNet), in which different regions are treated differently. In particular, we estimate the blur amounts of different regions via the internal geometric constraint of the dual-pixel (DP) data, which measures the defocus disparity between the left and right views. Based on the assumption that image regions with different blur amounts pose different deblurring difficulties, we leverage different networks with different capacities to treat different image regions. Moreover, we introduce a meta-learning defocus mask generation algorithm to assign each pixel to a proper branch. In this way, we can expect to maintain the information of the clear regions well while recovering the missing details of the blurred regions. Both quantitative and qualitative experiments demonstrate that our BaMBNet outperforms the state-of-the-art (SOTA) methods. For the dual-pixel defocus deblurring (DPD)-blur dataset, the proposed BaMBNet achieves a 1.20 dB gain over the previous SOTA method in terms of peak signal-to-noise ratio (PSNR) and reduces learnable parameters by 85%. The details of the code and dataset are available at https://github.com/junjun-jiang/BaMBNet.
With the rapid development of Web3D technologies, sketch-based model retrieval has become an increasingly important challenge, while the application of Virtual Reality and 3D technologies has made shape retrieval of furniture over a web browser feasible. In this paper, we propose a learning framework for shape retrieval based on two Siamese VGG-16 Convolutional Neural Networks (CNNs), and a CNN-based hybrid learning algorithm to select the best view for a shape. In this algorithm, the AlexNet and VGG-16 CNN architectures are used to perform classification tasks and to extract features, respectively. In addition, a feature fusion method is used to measure the similarity relation of the output features from the two Siamese networks. The proposed framework can provide new alternatives for furniture retrieval in the Web3D environment. The primary innovation is in the employment of deep learning methods to solve the challenge of obtaining the best view of 3D furniture, and to address cross-domain feature learning problems. We conduct an experiment to verify the feasibility of the framework and the results show our approach to be superior in comparison to many mainstream state-of-the-art approaches.
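One Siamese branch can be sketched as a VGG-16 feature extractor whose pooled outputs are compared by cosine similarity; the torchvision weights and the similarity measure are assumptions, and the paper's fusion of two Siamese networks is not reproduced.

```python
# Sketch: one VGG-16 branch producing embeddings compared by cosine similarity.
import torch
import torch.nn.functional as F
from torchvision import models

vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

@torch.no_grad()
def embed(x):                                # x: (B, 3, 224, 224)
    f = vgg(x)                               # (B, 512, 7, 7)
    return F.adaptive_avg_pool2d(f, 1).flatten(1)

sketch = torch.randn(1, 3, 224, 224)         # stand-in for a rendered sketch
view = torch.randn(1, 3, 224, 224)           # stand-in for a 3D shape view
score = F.cosine_similarity(embed(sketch), embed(view))
```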
One of the technical bottlenecks of traditional laser-induced breakdown spectroscopy (LIBS) is the difficulty in quantitative detection caused by the matrix effect. To troubleshoot this problem, this paper investigated a combination of time-resolved LIBS and convolutional neural networks (CNNs) to improve K determination in soil. The time-resolved LIBS contained information in both the wavelength and time dimensions. The spectra of the wavelength dimension showed the characteristic emission lines of elements, and those of the time dimension presented the plasma decay trend. The one-dimensional data of LIBS intensity from the emission line at 766.49 nm were extracted and correlated with the K concentration, showing a poor correlation of R_c^2 = 0.0967, which is caused by the matrix effect of heterogeneous soil. For the wavelength dimension, the two-dimensional data of traditional integrated LIBS were extracted and analyzed by an artificial neural network (ANN), showing R_v^2 = 0.6318 and a root mean square error of validation (RMSEV) = 0.6234. For the time dimension, the two-dimensional data of time-decay LIBS were extracted and analyzed by ANN, showing R_v^2 = 0.7366 and RMSEV = 0.7855. These higher determination coefficients reveal that both the non-K emission lines of the wavelength dimension and the spectral decay of the time dimension could assist in quantitative detection of K. However, due to limited calibration samples, the two-dimensional models presented over-fitting. The three-dimensional data of time-resolved LIBS were analyzed by CNNs, which extracted and integrated the information of both the wavelength and time dimensions, showing R_v^2 = 0.9968 and RMSEV = 0.0785. CNN analysis of time-resolved LIBS is capable of improving the determination of K in soil.
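The three-dimensional analysis can be pictured as feeding the full wavelength x time map to a small CNN regressor; the sketch below is a generic toy network, not the architecture used in the study.

```python
# Sketch: a toy CNN regressing K concentration from a wavelength x time map.
import torch
import torch.nn as nn

class LIBSRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):                    # x: (B, 1, n_wavelengths, n_delays)
        return self.head(self.features(x).flatten(1))

spectra = torch.randn(4, 1, 128, 32)          # toy time-resolved spectra
k_concentration = LIBSRegressor()(spectra)    # (4, 1) predicted concentrations
```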
Both time-delays and anti-windup (AW) problems are conventional problems in system design, but they are scarcely discussed for cellular neural networks (CNNs). This paper discusses stabilization for a class of distributed time-delayed CNNs with input saturation. Based on the Lyapunov theory and the Schur complement principle, a bilinear matrix inequality (BMI) criterion is designed to stabilize the system with input saturation. By matrix congruent transformation, the BMI control criterion can be changed into a linear matrix inequality (LMI) criterion, which can then be easily solved by computer. It is a one-step AW strategy in which the feedback compensator and the AW compensator are determined simultaneously. The attraction domain and its optimization are also discussed. The structure of CNNs with both constant time-delays and distributed time-delays is more general. This method is simple and systematic, allowing one to deal with a large class of such systems whose excitation satisfies the Lipschitz condition. The simulation results verify the effectiveness and feasibility of the proposed method.
Non-orthogonal multiple access (NOMA), featuring high spectrum efficiency, massive connectivity, and low latency, holds immense potential as a novel multi-access technique in fifth-generation (5G) communication. Successive interference cancellation (SIC) has proven to be an effective method to detect the NOMA signal by ordering the power of received signals and then decoding them. However, the error accumulation effect, referred to as error propagation, is an inevitable problem. In this paper, we propose a convolutional neural network (CNN) approach to restore the desired signal impaired by the multiple-input multiple-output (MIMO) channel. Especially in the uplink NOMA scenario, the proposed method can decode multiple users' information in a cluster instantaneously without any traditional communication signal processing steps. Simulation experiments are conducted in the Rayleigh channel and the results demonstrate that the error performance of the proposed learning system outperforms that of classic SIC detection. Consequently, deep learning has disruptive potential to replace conventional signal detection methods.
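For contrast with the learned detector, the classic SIC baseline can be sketched for a toy two-user power-domain superposition with BPSK symbols and AWGN; a wrong decision on the strong user corrupts the residual used for the weak user, which is the error-propagation effect mentioned above (power levels and noise here are illustrative):

```python
# Sketch: successive interference cancellation for a toy two-user NOMA link.
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 10_000, 0.8, 0.2                 # symbols and power allocation
s1 = rng.choice([-1, 1], n)                  # strong (high-power) user
s2 = rng.choice([-1, 1], n)                  # weak (low-power) user
y = np.sqrt(p1) * s1 + np.sqrt(p2) * s2 + 0.05 * rng.standard_normal(n)

s1_hat = np.sign(y)                          # decode the strong user first
residual = y - np.sqrt(p1) * s1_hat          # cancel its reconstructed signal
s2_hat = np.sign(residual)                   # decode the weak user from the residual

ber_strong = np.mean(s1_hat != s1)
ber_weak = np.mean(s2_hat != s2)             # inherits the strong user's errors
```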
Cancer is one of the most critical diseases and has caused several deaths in today's world. In most cases, doctors and practitioners are only able to diagnose cancer in its later stages. In the later stages, planning cancer treatment and increasing the patient's survival rate becomes a very challenging task. Therefore, it is essential to detect cancer in its early stages for appropriate treatment and surgery planning. Analysis and interpretation of medical images such as MRI and CT scans help doctors and practitioners diagnose many diseases, including cancer. However, manual interpretation of medical images is costly, time-consuming, and biased. Nowadays, deep learning, a subset of artificial intelligence, is gaining increasing attention from practitioners for automatically analysing and interpreting medical images without their intervention. Deep learning methods have reported extraordinary results in different fields due to their ability to automatically extract intrinsic features from images without any dependence on manually extracted features. This study provides a comprehensive review of deep learning methods in cancer detection and diagnosis, mainly focusing on breast cancer, brain cancer, skin cancer, and prostate cancer. This study describes various deep learning models and the steps for applying deep learning models to detect cancer. Recent developments in cancer detection based on deep learning methods have been critically analysed and summarised to identify critical challenges in applying them to detect cancer accurately in the early stages. Based on the identified challenges, we provide a few promising future research directions for fellow researchers in the field. The outcome of this study provides many clues for developing practical and accurate cancer detection systems for early diagnosis and treatment planning.
A demodulator based on convolutional neural networks (CNNs) is proposed to demodulate bipolar extended binary phase shift keying (EBPSK) signals transmitted at a faster-than-Nyquist (FTN) rate, solving the problem of severe intersymbol interference (ISI) caused by FTN-rate signals. With the characteristics of local connectivity, pooling, and weight sharing, a six-layer CNN structure is used to demodulate and eliminate ISI. The results show that with a symbol rate of 1.07 kBd, a band-pass filter (BPF) bandwidth in the transmitter of 1 kHz, and a changing number of carrier cycles per symbol K = 5, 10, 15, 28, the overall bit error ratio (BER) performance of CNNs with single-symbol decision is superior to that with double-symbol united decision. In addition, the BER performance of single-symbol decision is approximately 0.5 dB better than that of the coherent demodulator when K equals the total number of carrier cycles in a symbol, i.e., K = N = 28. With a symbol rate of 1.07 kBd, a BPF bandwidth in the transmitter of 500 Hz, and K = 5, 10, 15, 28, the overall BER performance of CNNs with double-symbol united decision is superior to that with single-symbol decision. Moreover, the double-symbol united-decision method is approximately 0.5 to 1.5 dB better than the coherent demodulator when K = N = 28. The demodulators based on CNNs successfully solve the serious ISI problems generated during the transmission of FTN-rate bipolar EBPSK signals, which is beneficial for the improvement of spectrum efficiency.
Although Convolutional Neural Networks (CNNs) have significantly advanced image Super-Resolution (SR) technology in recent years, SR methods for SAR images with large scale factors have rarely been studied due to their technical difficulty. A more efficient approach is to obtain comprehensive information to guide the SAR image reconstruction. Indeed, the co-registered High-Resolution (HR) optical image has been successfully applied to enhance the quality of SAR images thanks to its discriminative characteristics. Inspired by this, we propose a novel Optical-Guided Super-Resolution Network (OGSRN) for SAR images with large scale factors. Specifically, our proposed OGSRN consists of two sub-nets: a SAR image Super-Resolution U-Net (SRUN) and a SAR-to-Optical Residual Translation Network (SORTN). The training process includes two stages. In stage 1, the SR SAR images are reconstructed by the SRUN, and an Enhanced Residual Attention Module (ERAM), composed of Channel Attention (CA) and Spatial Attention (SA) mechanisms, is constructed to boost the representation ability of the network. In stage 2, the output of stage 1 and the corresponding HR SAR images are each translated to optical images by the SORTN. The differences between the SR and HR images are then computed in the optical space to obtain feedback information that reduces the space of possible SR solutions. After that, the optimized SRUN can directly produce an HR SAR image from a Low-Resolution (LR) SAR image in the testing phase. The experimental results show that, under the guidance of optical images, our OGSRN achieves excellent performance in both quantitative assessment metrics and visual quality.
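The channel-plus-spatial attention inside ERAM can be sketched with a CBAM-style block; the exact design in the paper may differ, so the module below is an assumption meant only to show how CA and SA rescale a feature map.

```python
# Sketch: a CBAM-style channel + spatial attention block (assumed ERAM-like design).
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):                    # x: (B, C, H, W)
        b, c, _, _ = x.shape
        ca = self.channel_mlp(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        x = x * ca                           # channel attention rescales channels
        sa = self.spatial_conv(torch.cat([x.mean(1, keepdim=True),
                                          x.amax(1, keepdim=True)], dim=1))
        return x * sa                        # spatial attention rescales positions

feat = torch.randn(2, 64, 32, 32)
out = ChannelSpatialAttention(64)(feat)      # same shape, attention-weighted
```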
The convolution operation possesses the characteristic of translation group equivariance. To achieve more group equivariances, rotation group equivariant convolutions (RGEC) are proposed to acquire both translation and rotation group equivariances. However, previous work paid more attention to the number of parameters and usually ignored other resource costs. In this paper, we construct our networks without introducing extra resource costs. Specifically, a convolution kernel is rotated to different orientations for feature extraction over multiple channels. Meanwhile, many fewer kernels than in previous works are used to ensure that the number of output channels does not increase. To further enhance the orthogonality of kernels in different orientations, we construct a non-maximum-suppression loss on the rotation dimension to suppress the other directions except the most activated one. Considering that low-level features benefit more from rotational symmetry, we only share weights in the shallow layers (SWSL) via RGEC. Extensive experiments on multiple datasets (i.e., ImageNet, CIFAR, and MNIST) demonstrate that SWSL can effectively benefit from the higher-degree weight sharing and improve the performance of various networks, including plain and ResNet architectures. Meanwhile, the convolutional kernels and parameters are much fewer (e.g., 75%, 87.5% fewer) in the shallow layers, and no extra computation costs are introduced.
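The parameter-free gain from kernel rotation can be sketched by rotating one kernel to four orientations and keeping the per-pixel maximum response; this toy omits the RGEC details and the non-maximum-suppression loss.

```python
# Sketch: one kernel rotated to four orientations, with orientation-max pooling.
import torch
import torch.nn.functional as F

kernel = torch.randn(1, 1, 3, 3)             # a single learnable kernel
image = torch.randn(1, 1, 28, 28)

responses = [F.conv2d(image, torch.rot90(kernel, k, dims=(2, 3)), padding=1)
             for k in range(4)]              # 0, 90, 180, 270 degree versions
out = torch.stack(responses, dim=0).amax(dim=0)   # keep the strongest orientation
```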
Computational prediction of in-hospital mortality in the setting of an intensive care unit can help clinical practitioners guide care and make early decisions about interventions. As clinical data are complex and varied in their structure and components, continued innovation in modelling strategies is required to identify architectures that can best model outcomes. In this work, we trained a Heterogeneous Graph Model (HGM) on electronic health record (EHR) data and used the resulting embedding vector as additional information for a Convolutional Neural Network (CNN) model predicting in-hospital mortality. We show that the additional information provided by including time as a vector in the embedding captured the relationships between medical concepts, lab tests, and diagnoses, which enhanced predictive performance. We found that adding the HGM to a CNN model increased the mortality prediction accuracy by up to 4%. This framework serves as a foundation for future experiments involving different EHR data types on important healthcare prediction tasks.
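The fusion of the HGM embedding with the CNN branch can be sketched as a simple concatenation before the final mortality classifier; the sizes and names below are illustrative, not those of the study.

```python
# Sketch: concatenating a graph embedding with CNN features for mortality prediction.
import torch
import torch.nn as nn

cnn_features = torch.randn(16, 128)          # stand-in output of the CNN branch
hgm_embedding = torch.randn(16, 64)          # stand-in patient embedding from the HGM
classifier = nn.Sequential(nn.Linear(128 + 64, 64), nn.ReLU(), nn.Linear(64, 1))
mortality_logit = classifier(torch.cat([cnn_features, hgm_embedding], dim=1))
```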
Funding (COVID-19 chest X-ray detection study): supported in part by the Coordenacao de Aperfeicoamento de Pessoal de Nível Superior-Brasil (CAPES) (001) and the Brazilian National Council for Research and Development (CNPq) (431709/2018-1, 311973/2018-3, 304315/2017-6, 430274/2018-1).
Funding (market trend forecasting study): supported by the "Bando Aiuti per progetti di Ricerca e Sviluppo-POR FESR 2014-2020-Asse 1, Azione 1.1.3, Project AlmostAnOracle-AI and Big Data Algorithms for Financial Time Series Forecasting".
Funding (acoustic scene classification study): supported by the German National BMBF IKT2020 Grant (16SV7213) (EmotAsS), the European Union's Horizon 2020 Research and Innovation Programme (688835) (DE-ENIGMA), and the China Scholarship Council (CSC).
Funding (hypergraph collaborative networks study): Ng was supported in part by the Hong Kong Research Grant Council General Research Fund (GRF), China (Nos. 12300218, 12300519, 117201020, 17300021, CRF C1013-21GF, C7004-21GF, and Joint NSFC-RGC NHKU76921). Wu was supported by the National Natural Science Foundation of China (No. 62206111), the Young Talent Support Project of Guangzhou Association for Science and Technology, China (No. QT-2023-017), the Guangzhou Basic and Applied Basic Research Foundation, China (No. 2023A04J1058), the Fundamental Research Funds for the Central Universities, China (No. 21622326), and the China Postdoctoral Science Foundation (No. 2022M721343).
Funding (oscillation detection study): the National Natural Science Foundation of China (62003298, 62163036), the Major Project of Science and Technology of Yunnan Province (202202AD080005, 202202AH080009), and the Yunnan University Professional Degree Graduate Practice Innovation Fund Project (ZC-22222770).
Funding (precipitation downscaling study): supported by the National Key Research and Development Program of China (Grant No. 2020YFA0608000), the National Natural Science Foundation of China (Grant No. 42030605), and the High-Performance Computing of Nanjing University of Information Science & Technology for their support of this work.
Funding (video object segmentation study): supported in part by the National Key R&D Program of China (2017YFB0502904) and the National Science Foundation of China (61876140).
Funding (defocus deblurring study): supported by the National Natural Science Foundation of China (61971165, 61922027, 61773295), in part by the Fundamental Research Funds for the Central Universities (FRFCU5710050119), the Natural Science Foundation of Heilongjiang Province (YQ2020F004), and the Chinese Association for Artificial Intelligence (CAAI)-Huawei MindSpore Open Fund.
Funding (sketch-based furniture shape retrieval study): supported in part by the Fundamental Research Funds for the Central Universities in China (No. 2100219066) and the Key Fundamental Research Funds for the Central Universities in China (No. 0200219153).
Funding (time-resolved LIBS study): supported by the National Natural Science Foundation of China (Grant No. 61505253) and the National Key Research and Development Plan of China (Project No. 2016YFD0200601).
Funding (delayed cellular neural network stabilization study): supported by the National Natural Science Foundation of China (61374003, 41631072) and the Academic Foundation of Naval University of Engineering (20161475).
Funding (NOMA signal detection study): supported by the National Natural Science Foundation of China (61471021).
文摘Non-orthogonal multiple access(NOMA), featuring high spectrum efficiency, massive connectivity and low latency, holds immense potential to be a novel multi-access technique in fifth-generation(5G) communication. Successive interference cancellation(SIC) is proved to be an effective method to detect the NOMA signal by ordering the power of received signals and then decoding them. However, the error accumulation effect referred to as error propagation is an inevitable problem. In this paper,we propose a convolutional neural networks(CNNs) approach to restore the desired signal impaired by the multiple input multiple output(MIMO) channel. Especially in the uplink NOMA scenario,the proposed method can decode multiple users' information in a cluster instantaneously without any traditional communication signal processing steps. Simulation experiments are conducted in the Rayleigh channel and the results demonstrate that the error performance of the proposed learning system outperforms that of the classic SIC detection. Consequently, deep learning has disruptive potential to replace the conventional signal detection method.
Abstract: Cancer is one of the most critical diseases and has caused numerous deaths worldwide. In most cases, doctors and practitioners are only able to diagnose cancer in its later stages, when planning treatment and improving the patient's survival rate become very challenging. It is therefore essential to detect cancer in its early stages so that appropriate treatment and surgery can be planned. The analysis and interpretation of medical images such as MRI and CT scans helps doctors and practitioners diagnose many diseases, including cancer. However, manual interpretation of medical images is costly, time-consuming, and prone to bias. Nowadays, deep learning, a subset of artificial intelligence, is gaining increasing attention for automatically analysing and interpreting medical images without human intervention. Deep learning methods have reported extraordinary results in many fields due to their ability to automatically extract intrinsic features from images without relying on manually engineered features. This study provides a comprehensive review of deep learning methods in cancer detection and diagnosis, focusing mainly on breast, brain, skin, and prostate cancer. It describes various deep learning models and the steps for applying them to cancer detection. Recent developments in deep-learning-based cancer detection are critically analysed and summarised to identify the key challenges in applying these methods to detect cancer accurately in its early stages. Based on the identified challenges, we suggest several promising future research directions for researchers in the field. The outcome of this study provides many clues for developing practical and accurate cancer detection systems for early diagnosis and treatment planning.
Funding: The National Natural Science Foundation of China (No. 6504000089).
Abstract: A demodulator based on convolutional neural networks (CNNs) is proposed to demodulate bipolar extended binary phase shift keying (EBPSK) signals transmitted at a faster-than-Nyquist (FTN) rate, solving the problem of severe intersymbol interference (ISI) caused by FTN-rate signals. Exploiting local connectivity, pooling, and weight sharing, a six-layer CNN structure is used to demodulate the signals and eliminate ISI. The results show that with a symbol rate of 1.07 kBd, a transmitter band-pass filter (BPF) bandwidth of 1 kHz, and a varying number of carrier cycles per symbol K = 5, 10, 15, 28, the overall bit error ratio (BER) performance of the CNN with single-symbol decision is superior to that with double-symbol united decision. In addition, the BER performance of the single-symbol decision is approximately 0.5 dB better than that of the coherent demodulator when K equals the total number of carrier cycles in a symbol, i.e., K = N = 28. With a symbol rate of 1.07 kBd, a transmitter BPF bandwidth of 500 Hz, and K = 5, 10, 15, 28, the overall BER performance of the CNN with double-symbol united decision is superior to that with single-symbol decision. Moreover, the double-symbol united decision method is approximately 0.5 to 1.5 dB better than the coherent demodulator when K = N = 28. The CNN-based demodulators successfully solve the serious ISI problems that arise in the transmission of FTN-rate bipolar EBPSK signals, which is beneficial for improving spectrum efficiency.
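The abstract mentions a six-layer CNN with local connectivity, pooling, and weight sharing; a generic sketch of such a 1-D CNN demodulator (window length, sampling assumptions, and layer widths are illustrative, not the paper's exact architecture) might look like the following:

```python
# Illustrative six-layer 1-D CNN demodulator (not the paper's exact architecture).
# Assumptions: each decision window holds 280 waveform samples; the output is the
# probability that the transmitted symbol is "1" (single-symbol decision).
import torch
import torch.nn as nn

demodulator = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),   # layer 1
    nn.MaxPool1d(2),                                          # layer 2: 280 -> 140
    nn.Conv1d(16, 32, kernel_size=7, padding=3), nn.ReLU(),  # layer 3
    nn.MaxPool1d(2),                                          # layer 4: 140 -> 70
    nn.Flatten(),
    nn.Linear(32 * 70, 64), nn.ReLU(),                        # layer 5
    nn.Linear(64, 1), nn.Sigmoid(),                           # layer 6: P(symbol = 1)
)

waveform = torch.randn(16, 1, 280)        # 16 noisy received symbol windows
print(demodulator(waveform).shape)        # torch.Size([16, 1])
```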
Funding: Supported by the National Natural Science Foundation of China (Nos. 61771319, 62076165 and 61871154), the Natural Science Foundation of Guangdong Province, China (No. 2019A1515011307), the Shenzhen Science and Technology Project, China (Nos. JCYJ20180507182259896 and 20200826154022001), and other projects (Nos. 2020KCXTD004 and WDZC20195500201).
Abstract: Although Convolutional Neural Networks (CNNs) have significantly advanced image Super-Resolution (SR) technology in recent years, SR methods for SAR images with large scale factors have rarely been studied due to their technical difficulty. A more effective approach is to obtain comprehensive information to guide the SAR image reconstruction. Indeed, co-registered High-Resolution (HR) optical images have been successfully applied to enhance SAR image quality thanks to their discriminative characteristics. Inspired by this, we propose a novel Optical-Guided Super-Resolution Network (OGSRN) for SAR images with large scale factors. Specifically, the proposed OGSRN consists of two sub-nets: a SAR image Super-Resolution U-Net (SRUN) and a SAR-to-Optical Residual Translation Network (SORTN). Training proceeds in two stages. In stage 1, SR SAR images are reconstructed by the SRUN, and an Enhanced Residual Attention Module (ERAM), composed of Channel Attention (CA) and Spatial Attention (SA) mechanisms, is constructed to boost the representation ability of the network. In stage 2, the output of stage 1 and the corresponding HR SAR images are both translated into optical images by the SORTN, and the differences between the SR and HR images are computed in the optical space to obtain feedback that reduces the space of possible SR solutions. After training, the optimized SRUN can directly produce HR SAR images from Low-Resolution (LR) SAR images in the testing phase. Experimental results show that, under the guidance of optical images, OGSRN achieves excellent performance in both quantitative assessment metrics and visual quality.
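As a rough illustration of the attention idea only (channel width, reduction ratio, and kernel sizes are assumptions; this is not the authors' ERAM implementation), a residual block combining channel and spatial attention can be sketched as follows:

```python
# Hypothetical residual block with channel + spatial attention,
# loosely in the spirit of the described ERAM (not the paper's code).
import torch
import torch.nn as nn

class AttentionResBlock(nn.Module):
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Channel attention: squeeze spatially, then re-weight channels.
        self.ca = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )
        # Spatial attention: compress channels, then re-weight positions.
        self.sa = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        f = self.body(x)
        f = f * self.ca(f)       # channel re-weighting
        f = f * self.sa(f)       # spatial re-weighting
        return x + f             # residual connection

block = AttentionResBlock()
feat = torch.randn(2, 64, 32, 32)
print(block(feat).shape)         # torch.Size([2, 64, 32, 32])
```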
Funding: Supported by the National Natural Science Foundation of China (Nos. 61976209 and 62020106015), the CAS International Collaboration Key Project (No. 173211KYSB20190024), and the Strategic Priority Research Program of CAS (No. XDB32040000).
Abstract: The convolution operation possesses translation group equivariance. To achieve further group equivariances, rotation group equivariant convolutions (RGEC) have been proposed to acquire both translation and rotation group equivariance. However, previous work focused mainly on the number of parameters and usually ignored other resource costs. In this paper, we construct our networks without introducing extra resource costs. Specifically, a convolution kernel is rotated to different orientations for feature extraction across multiple channels. Meanwhile, far fewer kernels than in previous works are used, so that the number of output channels does not increase. To further enhance the orthogonality of kernels in different orientations, we construct a non-maximum-suppression loss along the rotation dimension to suppress all orientations except the most activated one. Considering that low-level features benefit more from rotational symmetry, we share weights only in the shallow layers (SWSL) via RGEC. Extensive experiments on multiple datasets (i.e., ImageNet, CIFAR, and MNIST) demonstrate that SWSL can effectively benefit from the higher-degree weight sharing and improve the performance of various networks, including plain and ResNet architectures. Meanwhile, the shallow layers use far fewer convolutional kernels and parameters (e.g., 75% and 87.5% fewer), and no extra computation costs are introduced.
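To illustrate the weight-sharing idea in its simplest form (restricted here to 90-degree rotations for clarity; the paper's method is more general), a single set of base kernels can be rotated and reused so that several output orientations share one set of parameters:

```python
# Simplified illustration of rotation-shared convolution weights
# (90-degree rotations only; illustrative, not the paper's RGEC implementation).
import torch
import torch.nn.functional as F

weight = torch.randn(8, 3, 3, 3, requires_grad=True)   # 8 base kernels, 3 input channels

def rotation_shared_conv(x, weight):
    # Rotate each base kernel to 4 orientations; all orientations share parameters.
    rotated = torch.cat([torch.rot90(weight, k, dims=(2, 3)) for k in range(4)], dim=0)
    return F.conv2d(x, rotated, padding=1)              # 8 x 4 = 32 output channels

x = torch.randn(2, 3, 32, 32)
out = rotation_shared_conv(x, weight)
print(out.shape)                                        # torch.Size([2, 32, 32, 32])
```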
Abstract: Computational prediction of in-hospital mortality in the intensive care unit setting can help clinical practitioners guide care and make early intervention decisions. As clinical data are complex and varied in structure and components, continued innovation in modelling strategies is required to identify architectures that best model outcomes. In this work, we trained a Heterogeneous Graph Model (HGM) on electronic health record (EHR) data and used the resulting embedding vector as additional information for a Convolutional Neural Network (CNN) model predicting in-hospital mortality. We show that the additional information provided by including time as a vector in the embedding captured the relationships between medical concepts, lab tests, and diagnoses, which enhanced predictive performance. We found that adding the HGM to a CNN model increased the mortality prediction accuracy by up to 4%. This framework serves as a foundation for future experiments involving different EHR data types on important healthcare prediction tasks.
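A minimal sketch of the fusion idea follows (all dimensions, the 1-D CNN over a clinical time series, and the concatenation point are assumptions, not details from the paper): a precomputed graph-derived patient embedding is treated as an extra feature vector concatenated with the CNN's representation before the final classifier.

```python
# Hypothetical fusion of a graph-derived patient embedding with CNN features
# over a time series of clinical measurements (dimensions are illustrative).
import torch
import torch.nn as nn

class FusionMortalityModel(nn.Module):
    def __init__(self, n_measurements=20, n_timesteps=48, embed_dim=128):
        super().__init__()
        self.cnn = nn.Sequential(                     # 1-D CNN over the time axis
            nn.Conv1d(n_measurements, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),    # -> 64-dim summary per patient
        )
        self.classifier = nn.Sequential(
            nn.Linear(64 + embed_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),           # P(in-hospital mortality)
        )

    def forward(self, series, graph_embedding):
        features = self.cnn(series)                   # (batch, 64)
        fused = torch.cat([features, graph_embedding], dim=1)
        return self.classifier(fused)

model = FusionMortalityModel()
series = torch.randn(4, 20, 48)        # 4 patients, 20 measurements, 48 hourly steps
embedding = torch.randn(4, 128)        # precomputed graph-model patient embeddings
print(model(series, embedding).shape)  # torch.Size([4, 1])
```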