Funding: This work was supported in part by the National Natural Science Foundation of China under Grant No. 61977018, by the Deanship of Scientific Research at King Saud University, Riyadh, Saudi Arabia, through research group No. RG-1438-070, and in part by the Research Foundation of Education Bureau of Hunan Province of China under Grant 16B006.
Abstract: Image captioning aims to generate a natural language description of an image. In recent years, neural encoder-decoder models have been the dominant approach: a Convolutional Neural Network (CNN) encodes the image and a Long Short-Term Memory (LSTM) network decodes it into a sentence. Among these approaches, visual attention mechanisms are widely used to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. However, most conventional visual attention mechanisms rely on high-level image features only, ignoring the effects of other image features and giving insufficient consideration to the relative positions between image features. To address these problems, we propose a Position-Aware Transformer model with image-feature attention and position-aware attention mechanisms. The image-feature attention first extracts multi-level features with a Feature Pyramid Network (FPN) and then fuses them using scaled dot-product attention, which enables the model to detect objects of different scales more effectively without increasing the number of parameters. The position-aware attention mechanism first obtains the relative positions between image features and then incorporates them into the original image features to generate more accurate captions. Experiments on the MSCOCO dataset show that our approach achieves competitive BLEU-4, METEOR, ROUGE-L, and CIDEr scores compared with some state-of-the-art approaches, demonstrating its effectiveness.
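The parameter-free fusion step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each pyramid level has already been reduced to a vector of the same length and that one of them acts as the query; the function names `softmax` and `fuse_levels` and the toy vectors are hypothetical and are not taken from the paper, whose actual fusion may differ.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_levels(query, level_features):
    """Fuse same-length feature vectors with scaled dot-product attention.

    query          -- vector of length d (assumed: a high-level feature)
    level_features -- list of vectors of length d, one per pyramid level
    No learned weights are involved, so the fusion adds no parameters.
    """
    d = len(query)
    # Similarity of the query to each level, scaled by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(query, feat)) / math.sqrt(d)
        for feat in level_features
    ]
    weights = softmax(scores)
    # Weighted sum of the level features.
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, level_features))
        for i in range(d)
    ]
    return fused, weights

# Toy input: three "levels" with four channels each (made-up numbers).
levels = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
query = [2.0, 0.0, 0.0, 0.0]
fused, weights = fuse_levels(query, levels)
print(weights)
print(fused)
```

Levels that align with the query receive larger weights, so the fused vector leans toward them while still mixing in the other scales.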
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) under Grant No. 2009AA12Z322.
Abstract: Broadcasting the augmentation information of a satellite navigation system via GEO satellites is an effective method. However, the GEO signal becomes difficult to receive in some situations, for example in cities or canyons, where it is blocked by large buildings or mountains. To solve this problem, an Internet-based broadcast network built on application-layer multicast protocols has been proposed, which uses the infrastructure of the Internet to broadcast the augmentation information. In this paper, a topology- and position-aware overlay network construction protocol is proposed to build this network. Simulation results show that the new algorithm achieves better performance in terms of delay, depth, and degree utilization.
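The abstract does not give the protocol itself, so the following is only a generic sketch of position-aware, degree-constrained multicast tree construction, not the paper's algorithm. It assumes that each node has known 2D coordinates, that Euclidean distance stands in for delay, and that every node can serve at most `max_children` children; all names and coordinates are hypothetical.

```python
import math

def distance(a, b):
    """Euclidean distance between two (x, y) points (assumed delay proxy)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def build_overlay(positions, source, max_children):
    """Greedily attach each node to the nearest tree node with spare degree."""
    parent = {source: None}
    child_count = {source: 0}
    # Nodes closer to the source join first.
    joining = sorted(
        (n for n in positions if n != source),
        key=lambda n: distance(positions[n], positions[source]),
    )
    for node in joining:
        # Only nodes already in the tree with free capacity may be parents.
        candidates = [p for p in parent if child_count[p] < max_children]
        best = min(candidates, key=lambda p: distance(positions[p], positions[node]))
        parent[node] = best
        child_count[best] += 1
        child_count[node] = 0
    return parent

def depth(parent, node):
    """Number of overlay hops from the source to the given node."""
    hops = 0
    while parent[node] is not None:
        node = parent[node]
        hops += 1
    return hops

# Made-up coordinates for a five-node example.
positions = {"src": (0, 0), "a": (1, 0), "b": (2, 0), "c": (0, 1), "d": (5, 5)}
parent = build_overlay(positions, "src", max_children=2)
print(parent)
print(depth(parent, "d"))
```

The degree cap bounds the forwarding load on each node, while nearest-parent selection keeps per-hop delay low; tree depth is the price paid when nearby parents are already full.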