Funding: Supported by the National Natural Science Foundation of China (No. 61973033) and the Chongqing Natural Science Foundation, China (No. cstc2021jcyjmsxmX0737).
Abstract: Obtaining absolute pose from pre-loaded satellite images is an important means of autonomous navigation for small Unmanned Aerial Vehicles (UAVs) in Global Navigation Satellite System (GNSS) denied environments. Most previous works build Convolutional Neural Networks (CNNs) to extract features and then directly regress the pose, an approach that fails under the large viewpoint and size differences between "UAV-satellite" image pairs in real-world scenarios. Therefore, this paper proposes a probability distribution/regression integrated deep model with an attention-guided triple fusion mechanism, which estimates discrete distributions in pose space and three-dimensional vectors in translation space. To overcome the shortage of relevant datasets, this paper simulates image datasets captured by UAVs with forward-facing cameras during target searching and autonomous attacking. The effectiveness, superiority, and robustness of the proposed method are verified on simulated datasets and in flight tests.
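As a rough illustration of what estimating a discrete distribution in pose space can look like in practice, the sketch below decodes a distribution over heading bins into a single angle with a circular mean. It is not the paper's model: the single yaw angle, the evenly spaced bins, and the names `softmax` and `decode_yaw` are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_yaw(logits):
    """Circular mean of a discrete yaw distribution.

    Assumption (hypothetical, not from the paper): bin i represents the
    angle 2*pi*i/n, with n evenly spaced bins covering [0, 2*pi).
    """
    probs = softmax(logits)
    n = len(probs)
    s = sum(p * math.sin(2 * math.pi * i / n) for i, p in enumerate(probs))
    c = sum(p * math.cos(2 * math.pi * i / n) for i, p in enumerate(probs))
    # atan2 handles the wrap-around at 0 / 2*pi that a plain weighted
    # average of bin angles would get wrong.
    return math.atan2(s, c) % (2 * math.pi)


if __name__ == "__main__":
    scores = [0.0, 0.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # peak at bin 2 of 8
    print(decode_yaw(scores))
```

One reason to prefer a distribution over a directly regressed angle is that it can represent ambiguity, such as two plausible headings, whereas a single regressed value tends to average them.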
Funding: Supported in part by the National Natural Science Foundation of China (62225309, 62073222, U21A20480, 62361166632).
Abstract: Autonomous navigation for intelligent mobile robots has gained significant attention, with a focus on enabling robots to generate reliable policies by maintaining spatial memory. In this paper, we propose a learning-based visual navigation pipeline that uses topological maps as memory configurations. We introduce an online topology construction approach that fuses odometry pose estimation and perceptual similarity estimation. This tackles the issues of topological node redundancy and incorrect edge connections, which stem from the distribution gap between the spatial and perceptual domains. Furthermore, we propose a differentiable graph extraction structure, the topology multi-factor transformer (TMFT). This structure uses graph neural networks to integrate global memory and incorporates a multi-factor attention mechanism to emphasize elements closely related to relevant target cues for policy generation. Results from photorealistic simulations on image-goal navigation tasks highlight the superior navigation performance of the proposed pipeline compared to existing memory structures. Comprehensive validation through behavior visualization, interpretability tests, and real-world deployment further underscores the adaptability and efficacy of our method.
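The idea of fusing odometry and perceptual similarity during online topology construction can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the class `TopoMap`, the helper `cosine`, both thresholds, and the rule "reuse a node only when it is both spatially close and visually similar" are hypothetical choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


class TopoMap:
    """Toy online topological map (hypothetical sketch)."""

    def __init__(self, dist_thresh=1.0, sim_thresh=0.8):
        self.nodes = []      # list of (odometry position, visual embedding)
        self.edges = set()   # undirected edges stored as (low_id, high_id)
        self.dist_thresh = dist_thresh
        self.sim_thresh = sim_thresh
        self.current = None  # id of the node the robot was last localized at

    def update(self, position, embedding):
        """Localize to an existing node or add a new one; return its id."""
        match = None
        for idx, (pos, emb) in enumerate(self.nodes):
            close = math.dist(position, pos) <= self.dist_thresh
            similar = cosine(embedding, emb) >= self.sim_thresh
            # Requiring both cues avoids merging look-alike places that are
            # far apart, and nearby places that look different.
            if close and similar:
                match = idx
                break
        if match is None:
            self.nodes.append((position, embedding))
            match = len(self.nodes) - 1
        # Connect only consecutively visited nodes, so every edge
        # corresponds to a transition the robot actually made.
        if self.current is not None and self.current != match:
            self.edges.add((min(self.current, match), max(self.current, match)))
        self.current = match
        return match


if __name__ == "__main__":
    m = TopoMap()
    m.update((0.0, 0.0), (1.0, 0.0))
    m.update((0.3, 0.0), (0.0, 1.0))
    print(len(m.nodes), sorted(m.edges))
```

Using only one cue would reproduce the two failure modes the abstract names: a similarity-only test can link distant look-alike places with incorrect edges, while a distance-only test can accumulate redundant nodes as odometry drifts.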