Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61671031, 61722102, and 91738301).
Abstract: Unmanned Aerial Vehicle (UAV) navigation aims to guide a UAV to desired destinations along a collision-free and efficient path without human intervention, and it plays a crucial role in autonomous missions in harsh environments. Recently emerging Deep Reinforcement Learning (DRL) methods have shown promise for addressing the UAV navigation problem, but most of them fail to converge because of the massive amount of interactive data required when a UAV navigates in highly dynamic environments with numerous fast-moving obstacles. In this work, we propose an improved DRL-based method to tackle these fundamental limitations. Specifically, we develop a distributed DRL framework that decomposes the UAV navigation task into two simpler sub-tasks, each of which is solved by a Long Short-Term Memory (LSTM) based DRL network using only part of the interactive data. Furthermore, a clipped DRL loss function is proposed to closely stack the two sub-solutions into one integral solution to the UAV navigation problem. Extensive simulation results corroborate the superiority of the proposed method, in terms of convergence and effectiveness, over state-of-the-art DRL methods.
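The abstract does not give the exact form of the paper's clipped DRL loss. As a minimal illustration of the general idea, the sketch below implements the standard PPO-style clipped surrogate objective, where the new-to-old policy probability ratio is clipped so that a single update cannot move the policy too far from the data-collecting policy. The function name and the clipping range `eps` are assumptions, not values from the paper.

```python
import numpy as np

def clipped_surrogate_loss(ratio, advantage, eps=0.2):
    """PPO-style clipped objective (illustrative, not the paper's loss).

    ratio     : new-policy / old-policy action probabilities, per sample
    advantage : advantage estimates, per sample
    eps       : clipping range (hypothetical default)
    """
    unclipped = ratio * advantage
    # Clip the ratio to [1 - eps, 1 + eps] so one update cannot push the
    # policy far from the policy that collected the data.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Element-wise minimum gives a pessimistic lower bound on the
    # surrogate objective; negate it to obtain a loss to minimize.
    return -np.mean(np.minimum(unclipped, clipped))
```

For a single sample with ratio 2.0 and advantage 1.0, the clipped term (1.2) dominates, so a large ratio yields no extra gradient signal beyond the clip boundary.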
Abstract: The autonomous navigation of an Unmanned Aerial Vehicle (UAV) relies heavily on its navigation sensors. The UAV's level of autonomy depends upon its various navigation systems, such as state measurement, mapping, and obstacle avoidance. Selecting the correct components is a critical part of the design process, but it can be a particularly difficult task, especially for novices, as there are many technologies and components available on the market, each with its own advantages and disadvantages. For example, satellite-based navigation components should be avoided when designing indoor UAVs: incorporating them brings no added value to the final product and simply increases cost and power consumption. Another issue is the number of vendors on the market, each trying to sell hardware solutions that often incorporate similar technologies. This paper aims to serve as a guide, proposing methods to support the selection of fit-for-purpose technologies and components while avoiding system layout conflicts. It presents a study of the various navigation technologies and supports engineers in selecting specific hardware solutions from given requirements. The selection methods are based on easy-to-follow flow charts, and a comparison of the specifications of various hardware components is also included.
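The paper's flow charts are not reproduced in the abstract, but the kind of rule such a chart encodes can be sketched in code, using the abstract's own example that indoor UAVs should not carry satellite-navigation components. The function name, the payload threshold, and the component names below are all hypothetical illustrations, not items from the paper.

```python
def select_positioning_tech(indoor: bool, payload_budget_g: float) -> list:
    """Illustrative selection rule in the spirit of a component-selection
    flow chart. Thresholds and component names are hypothetical."""
    candidates = []
    if not indoor:
        # Satellite navigation only adds value outdoors; indoors it just
        # increases cost and power consumption.
        candidates.append("GNSS receiver")
    if payload_budget_g >= 50:
        # A heavier sensing payload permits camera-based state estimation.
        candidates.append("stereo camera + visual odometry")
    # Some inertial measurement is needed for state estimation regardless.
    candidates.append("IMU")
    return candidates
```

Encoding the chart as a function makes the selection reproducible and lets conflicting requirements (e.g., indoor flight plus a GNSS request) be caught automatically.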
Funding: Supported by the National High Technology Research and Development Program of China (863 Program, No. 2006AA12A108, "Multi-sensor Integrated Navigation in Aeronautics Field") from the Ministry of Science and Technology of China, a CSC International Scholarship (No. 2008104769) from the Chinese Scholarship Council, and an International Postgraduate Research Scholarship (No. 2009800778591) from the Australian Government.
Abstract: In this paper, a new reactive mechanism based on perception-action bionics for multi-sensory integration applied to Unmanned Aerial Vehicle (UAV) navigation is proposed. The strategy is inspired by the olfactory bulb neural activity observed in rabbits subjected to external stimuli. The new UAV navigation technique exploits a multiscroll chaotic system that can be controlled in real time towards less complex orbits, such as periodic orbits or equilibrium points, considered as perceptive orbits. These are subject to real-time modification on the basis of environment changes acquired through a Synthetic Aperture Radar (SAR) sensory system. The mathematical details of the approach are given, including simulation results in a virtual environment. The results demonstrate the capability of autonomous UAV navigation based on chaotic bionics theory in complex spatial environments.
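The central mechanism described in the abstract, steering a chaotic system toward a simpler perceptive orbit, can be sketched numerically. The system below is a generic Chua-like flow with a sine nonlinearity (a common multiscroll generator), not the paper's actual model, and the proportional feedback that pins the state to an equilibrium is likewise an illustrative assumption; the parameter values are hypothetical.

```python
import numpy as np

def multiscroll_step(state, dt=0.005, a=0.7, control_gain=0.0,
                     target=np.zeros(3)):
    """One Euler step of a Chua-like flow with a sine nonlinearity.

    With control_gain = 0 the free dynamics drift over the scrolls; a
    positive gain adds proportional feedback that drives the state to
    `target`, mimicking the switch to a simpler 'perceptive orbit'.
    All parameters are illustrative, not taken from the paper.
    """
    x, y, z = state
    dx = y
    dy = z
    dz = -a * (x + y + z) + np.sin(np.pi * x)  # sine term creates multiple scrolls
    drift = np.array([dx, dy, dz])
    control = -control_gain * (state - target)  # simple proportional stabilizer
    return state + dt * (drift + control)
```

With a sufficiently large gain the trajectory settles onto the chosen equilibrium, which is the control action the abstract describes: collapsing complex chaotic motion onto a simpler orbit on demand.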
Abstract: In some military application scenarios, Unmanned Aerial Vehicles (UAVs) need to perform missions with the assistance of on-board cameras when radar is not available and communication is interrupted, which poses challenges for UAV autonomous navigation and collision avoidance. In this paper, an improved deep-reinforcement-learning algorithm, a Deep Q-Network with a Faster R-CNN model and a Data Deposit Mechanism (FRDDM-DQN), is proposed. A Faster R-CNN model (FR) is introduced and optimized to extract obstacle information from images, and a new replay-memory Data Deposit Mechanism (DDM) is designed to train a better-performing agent. During training, a two-part training approach is used to reduce the time spent on training, as well as on retraining when the scenario changes. To verify the performance of the proposed method, a series of experiments, including training experiments, test experiments, and typical-episode experiments, is conducted in a 3D simulation environment. Experimental results show that the agent trained by the proposed FRDDM-DQN can navigate autonomously and avoid collisions, and performs better than the FR-DQN, FR-DDQN, FR-Dueling DQN, YOLO-based YDDM-DQN, and original-FR-output-based FR-ODQN.
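The abstract does not detail how the Data Deposit Mechanism decides which transitions enter the replay memory. The sketch below illustrates the general idea of a selective deposit rule: informative transitions (e.g., collisions or goal arrivals) are always stored, while routine ones are stored only with some probability. The class name, the keep probability, and the informative/routine split are hypothetical, not the paper's actual DDM.

```python
import random
from collections import deque

class SelectiveReplayBuffer:
    """Replay memory with a selective deposit rule (illustrative only)."""

    def __init__(self, capacity=10000, routine_keep_prob=0.25, rng=None):
        self.buffer = deque(maxlen=capacity)   # oldest entries evicted first
        self.routine_keep_prob = routine_keep_prob
        self.rng = rng or random.Random(0)     # seeded for reproducibility

    def deposit(self, transition, informative):
        # Always keep rare, informative transitions; subsample the rest so
        # the buffer is not dominated by uneventful flight.
        if informative or self.rng.random() < self.routine_keep_prob:
            self.buffer.append(transition)

    def sample(self, batch_size):
        return self.rng.sample(list(self.buffer),
                               min(batch_size, len(self.buffer)))
```

Biasing what is deposited (rather than only what is sampled, as in prioritized replay) keeps memory and training batches focused on the transitions that actually carry learning signal.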
Funding: Co-supported by the National Natural Science Foundation of China (Nos. 62003267 and 61573285), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2020JQ-220), the Open Project of Science and Technology on Electronic Information Control Laboratory, China (No. JS20201100339), and the Open Project of Science and Technology on Electromagnetic Space Operations and Applications Laboratory, China (No. JS20210586512).
Abstract: As advanced combat weapons, Unmanned Aerial Vehicles (UAVs) have been widely used in military operations. In this paper, we formulate the Autonomous Navigation Control (ANC) problem of UAVs as a Markov Decision Process (MDP) and propose a novel Deep Reinforcement Learning (DRL) method that allows UAVs to perform dynamic target-tracking tasks in large-scale unknown environments. To overcome the problem of limited training experience, the proposed Imaginary Filtered Hindsight Experience Replay (IFHER) generates successful episodes by reasonably imagining the target trajectory in a failed episode to augment the experiences. The well-designed goal, episode, and quality filtering strategies ensure that only high-quality augmented experiences are stored, while the sampling filtering strategy of IFHER ensures that these stored augmented experiences are fully learned according to their high priorities. By training in a complex environment constructed from the parameters of a real UAV, the proposed IFHER algorithm improves the convergence speed by 28.99% and the convergence result by 11.57% compared to the state-of-the-art Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. Testing experiments carried out in environments of different complexities demonstrate the strong robustness and generalization ability of the IFHER agent. Moreover, the flight trajectory of the IFHER agent shows the superiority of the learned policy and the practical application value of the algorithm.
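IFHER builds on Hindsight Experience Replay; its imagination and filtering strategies are specific to the paper, but the underlying HER relabeling step can be sketched plainly. The sketch below uses the standard "final" strategy (substitute the episode's last achieved state for the original goal) with a hypothetical sparse reward; none of the names come from the paper.

```python
def hindsight_relabel(episode, reward_fn):
    """Relabel a failed episode with an achieved state as the new goal.

    episode   : list of dicts with keys "state", "action", "achieved"
    reward_fn : reward_fn(achieved, goal) -> float, e.g. a sparse reward
    Standard HER 'final' strategy, shown as background for IFHER.
    """
    # Pretend the state actually reached at the end was the goal all along,
    # turning a failed episode into a successful one for the substitute goal.
    new_goal = episode[-1]["achieved"]
    relabeled = []
    for step in episode:
        relabeled.append({
            "state": step["state"],
            "action": step["action"],
            "goal": new_goal,
            "reward": reward_fn(step["achieved"], new_goal),
        })
    return relabeled
```

IFHER's contribution, per the abstract, is to go further: instead of only reusing achieved states, it imagines a plausible target trajectory for the failed episode and then filters the augmented experiences by goal, episode, and quality before storing them.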