Funding: National Natural Science Foundation of China (No. 62001100).
Abstract: Spoofed speech poses a hidden danger to automatic speaker verification (ASV) systems. These threats fall into two categories: logical access (LA) and physical access (PA). To improve the detection of spoofed speech, this paper studies the features used for detection. First, following the idea of modifying constant-Q-based features, the work considers adding the variance or the mean to the constant-Q-based cepstral domain to obtain good performance. Second, linear frequency cepstral coefficients (LFCCs) performed comparably with constant-Q-based features. Finally, we propose linear frequency variance-based cepstral coefficients (LVCCs) and linear frequency mean-based cepstral coefficients (LMCCs) for detecting spoofed speech. LVCCs and LMCCs are obtained by adding the frame variance or the frame mean to the log magnitude spectrum used in LFCC extraction. The proposed features were evaluated on the ASVspoof 2019 dataset. The experimental results show that, compared with known hand-crafted features, LVCCs and LMCCs are more effective at resisting spoofed speech attacks.
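The abstract describes LVCCs and LMCCs only in one sentence, so the following is a minimal sketch of one possible reading rather than the authors' implementation. It assumes a conventional LFCC pipeline (windowed frames, magnitude spectrum, linear triangular filterbank, log, DCT) and assumes that the "frame variance" or "frame mean" is a scalar computed over each frame's log filterbank spectrum and added to every bin before the DCT. All function names, frame sizes, and filter counts are hypothetical.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping Hamming-windowed frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def linear_filterbank(n_filters, n_fft):
    """Triangular filters spaced uniformly on a linear frequency axis."""
    n_bins = n_fft // 2 + 1
    edges = np.linspace(0, n_bins - 1, n_filters + 2)
    bins = np.arange(n_bins)[None, :]
    left, center, right = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (bins - left) / (center - left)
    falling = (right - bins) / (right - center)
    return np.clip(np.minimum(rising, falling), 0.0, None)

def dct_ii(x, n_coeffs):
    """Orthonormal DCT-II along the last axis, keeping n_coeffs outputs."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    scale = np.full((n_coeffs, 1), np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return x @ (basis * scale).T

def cepstral_features(x, stat=None, n_fft=512, n_filters=20, n_coeffs=20):
    """stat=None gives LFCC-style features; 'variance' or 'mean' adds the
    per-frame statistic to the log spectrum (ASSUMED reading of LVCC/LMCC)."""
    frames = frame_signal(x)
    mag = np.abs(np.fft.rfft(frames, n=n_fft, axis=-1))
    log_spec = np.log(mag @ linear_filterbank(n_filters, n_fft).T + 1e-10)
    if stat == "variance":
        log_spec = log_spec + log_spec.var(axis=-1, keepdims=True)
    elif stat == "mean":
        log_spec = log_spec + log_spec.mean(axis=-1, keepdims=True)
    return dct_ii(log_spec, n_coeffs)
```

Calling `cepstral_features(x, stat="variance")` on a one-second, 16 kHz signal returns one 20-dimensional vector per frame. Under the per-frame scalar assumption used here, the added statistic shifts only the zeroth coefficient, because the higher DCT basis vectors sum to zero; a statistic computed per frequency bin across frames would behave differently, and the abstract does not say which is intended.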
Abstract: Evaluation of calligraphic copies is the core of Chinese calligraphy appreciation and inheritance. However, previous aesthetic evaluation studies often focussed on photos and paintings, with few attempts on Chinese calligraphy. To solve this problem, a Siamese regression aesthetic fusion method, named SRAFE, is proposed for Chinese calligraphy, combining calligraphy aesthetics with deep learning. First, a dataset termed Evaluated Chinese Calligraphy Copies (E3C) is constructed for aesthetic evaluation. Second, 12 hand-crafted aesthetic features based on the shape, structure, and stroke of calligraphy are designed. Then, a Siamese regression network (SRN) is designed to extract a deep aesthetic representation of calligraphy. Finally, the SRAFE method is built by fusing the deep aesthetic features with the hand-crafted aesthetic features. Experimental results show that the scores given by SRAFE are similar to the aesthetic evaluation labels of E3C, supporting the effectiveness of the authors' method.
Abstract: The well-established mortality rates due to lung cancer, the scarcity of radiology experts, and inter-observer variability underpin the dire need for robust and accurate computer-aided diagnostics to provide a second opinion. To this end, we propose a feature grafting approach to classify lung cancer images from the publicly available National Institutes of Health (NIH) chest X-ray dataset, which comprises 30,805 unique patients. The performance of transfer learning with pre-trained VGG and Inception models is compared against manually extracted radiomics features added to a convolutional neural network through a custom layer. For classification with both approaches, Support Vector Machines (SVMs) are used. The results from 5-fold cross-validation report an Area Under Curve (AUC) of 0.92 and an accuracy of 96.87% in detecting lung nodules with the proposed method. This improves on the observed accuracy of transfer learning using Inception (79.87%). The specificity of all methods is >99%; however, the sensitivity of the proposed method (97.24%) surpasses that of the transfer learning approaches (<67%). Furthermore, the true positive rate with SVM is the highest at the same false-positive rate in experiments that also include Random Forest, Decision Tree, and K-Nearest Neighbor classifiers. Hence, the proposed approach can be used in clinical and research environments to provide second opinions very close to the experts' intuition.
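The gap the abstract reports between specificity (>99% for all methods) and sensitivity (<67% for transfer learning) follows from how the two metrics are defined. The sketch below computes both from binary labels; the function name and the toy labels are made up for illustration and are not the paper's data.

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for 0/1 labels.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # nodules found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # nodules missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }

# Toy example (hypothetical labels): a classifier that rarely predicts
# "positive" keeps specificity perfect while missing most positives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

In this toy case specificity is 1.0 and accuracy is 0.8, yet only one of the three positives is detected, which is why sensitivity is reported separately for a screening task.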
Funding: Supported by the Natural Science Foundation of Universities of Anhui Province (KJ2019A1168) and the Excellent Young Talent Support Project of Anhui Province (gxyq2020109).
Abstract: Although convolutional neural networks (CNNs) are mature in many domains, understanding of the directions in which CNN parameters are learned lags behind, and the functions that convolutional networks (ConvNets) learn remain difficult to study. A method is proposed to guide ConvNets to learn towards an expected direction. First, to help the network converge, a novel feature enhancement framework, the enhancement network (EN), is devised to learn parameters according to certain rules. Second, two types of hand-crafted rules, feature-sharpening (FS) and feature-amplifying (FA), are proposed to enable effective ENs and are embedded into the CNN for end-to-end learning. Specifically, the former sharpens convolutional features and the latter amplifies convolutional features linearly. Both tools share the same goal: a stronger inductive bias and more straightforward loss functions. Finally, experiments are conducted on the Modified National Institute of Standards and Technology (MNIST) and Canadian Institute for Advanced Research 10 (CIFAR10) datasets. Experimental results demonstrate that ENs converge faster by formulating hand-crafted rules.
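Abstract: As a cross-camera retrieval task, person re-identification is affected by variations in image style caused by different camera viewpoints. In recent years, many algorithms have learned features directly from raw input images with neural networks. Although these features describe the whole person well, they ignore much local detail and are prone to misidentification in complex scenes. To address this problem, a new feature representation method based on multi-task learning is studied. It adopts a Siamese network structure with paired inputs, feeds local maximal occurrence (LOMO) features together with deep features into the network, and maps them into a single feature space for training, forming a new network model called TDFN (traditional and deep features fusion network). Exploiting the self-learning property of neural networks, the loss functions of multiple tasks are combined to update the network, so that the deep features learn more detail that is complementary to the hand-crafted local features. Experiments show that the new features outperform the global descriptor extracted directly from the Siamese network in both mean average precision (mAP) and Rank-1 accuracy.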
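The two metrics named in the abstract, Rank-1 accuracy and mAP, can be sketched as follows. This is a simplified illustration under stated assumptions: it takes a precomputed query-to-gallery distance matrix and omits the same-camera filtering that standard re-identification protocols usually apply. The function name and inputs are hypothetical.

```python
import numpy as np

def rank1_and_map(dist, query_ids, gallery_ids):
    """Rank-1 accuracy and mean average precision for a retrieval task.

    dist: (n_query, n_gallery) distances, smaller means more similar.
    Queries with no matching identity in the gallery are skipped.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    rank1_hits, average_precisions = [], []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                  # closest gallery item first
        matches = gallery_ids[order] == query_ids[q]
        if not matches.any():
            continue
        rank1_hits.append(bool(matches[0]))          # is the top result correct?
        hit_ranks = np.flatnonzero(matches) + 1      # 1-based ranks of true matches
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))

# Toy distances (made up): query 0 ranks both of its matches first,
# query 1 finds its only match at rank 3.
dist = [[0.1, 0.5, 0.3],
        [0.2, 0.9, 0.4]]
rank1, mean_ap = rank1_and_map(dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
```

Rank-1 only checks the top result, whereas mAP rewards ranking every correct gallery image early, so the two can diverge when an identity has several gallery images.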
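Abstract: Loop closure detection is an important component of simultaneous localization and mapping (SLAM). It effectively reduces the accumulated error in a SLAM system, and if tracking is lost during localization and mapping, it can also be used for relocalization. Compared with traditional hand-crafted features, image features learned by neural networks offer better environmental invariance and semantic recognition ability. Considering that landmark-based convolutional features can overcome the sensitivity of whole-image features to viewpoint changes, this paper proposes a new loop closure detection algorithm. It first identifies regions of interest in the image directly from the convolutional layers of a convolutional neural network to generate landmarks, then extracts convolutional features for each identified landmark to build the final image representation used for loop closure detection. To verify the effectiveness of the algorithm, comparative experiments were carried out on typical datasets. The results show that the proposed algorithm performs very well and remains highly robust even under extreme viewpoint and appearance changes.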
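The abstract does not say how the landmark features are compared, so the following is only a sketch of one plausible matching step, not the paper's algorithm. It assumes each image is already represented by a small array of landmark descriptors (one row per landmark) and scores two images by the symmetric average of best-match cosine similarities. The function names and the 0.8 threshold are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length (epsilon guards against zero rows)."""
    x = np.asarray(x, dtype=float)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)

def image_similarity(landmarks_a, landmarks_b):
    """Symmetric mean of best-match cosine similarities between two
    sets of landmark descriptors, each of shape (n_landmarks, dim)."""
    sim = l2_normalize(landmarks_a) @ l2_normalize(landmarks_b).T
    return 0.5 * (sim.max(axis=1).mean() + sim.max(axis=0).mean())

def detect_loop(query, database, threshold=0.8):
    """Return (index, score) of the best database image if its score
    reaches the threshold, otherwise (None, score)."""
    scores = [image_similarity(query, entry) for entry in database]
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return best, float(scores[best])
    return None, float(scores[best])

# Toy descriptors (made up): the second database entry repeats the query.
query = np.eye(3, 4)
database = [np.array([[0.0, 0.0, 0.0, 1.0]]), np.eye(3, 4)]
match_index, match_score = detect_loop(query, database)
```

Matching landmark by landmark rather than with one whole-image descriptor reflects the abstract's motivation: a landmark seen from a new viewpoint can still find its counterpart even when the overall image layout has changed.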