Funding: supported by the National High-Tech Research and Development Plan of China (Grant No. 2009AA01Z441), the National Natural Science Foundation of China (Grant Nos. 60873191 and 60821001), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant Nos. 20091103120014 and 20090005110010), and the Beijing Natural Science Foundation (Grant Nos. 1093015 and 1102004).
Abstract: A quantum secure direct communication protocol with cluster states is proposed. Compared with the deterministic secure quantum communication protocol with the cluster state proposed by Yuan and Song (Int. J. Quant. Inform., 2009, 7: 689), this protocol achieves higher intrinsic efficiency by using two-step transmission. The implementation of this protocol is also discussed.
Funding: supported by the National Natural Science Foundation of China (Grant No. 10847147) and the Science Foundation of Nanjing University of Information Science & Technology (Grant No. 20080279).
Abstract: We present two deterministic secure quantum communication schemes over collective-noise channels. One achieves secure quantum communication against collective-rotation noise and the other against collective-dephasing noise. The two parties of quantum communication can exploit the correlation of their subsystems to check eavesdropping efficiently. Although the sender must prepare a sequence of three-photon entangled states to accomplish secure communication against collective noise, the two parties need only single-photon measurements, rather than Bell-state measurements, which makes our schemes convenient in practical application.
Abstract: Mobile Edge Computing (MEC) is one of the most promising techniques for next-generation wireless communication systems. In this paper, we study the problem of dynamic caching, computation offloading, and resource allocation in cache-assisted multi-user MEC systems with stochastic task arrivals. There are multiple computationally intensive tasks in the system, and each Mobile User (MU) needs to execute a task either locally or remotely in one or more MEC servers by offloading the task data. Popular tasks can be cached in MEC servers to avoid duplicates in offloading. The cached contents can be obtained through user offloading, fetched from a remote cloud, or fetched from another MEC server. The objective is to minimize the long-term average of a cost function, defined as a weighted sum of energy consumption, delay, and cache-fetching costs. The weighting coefficients associated with the different metrics can be adjusted to balance the tradeoff among them. The optimum design is performed with respect to four decision parameters: whether to cache a given task, whether to offload a given uncached task, how much transmission power to use during offloading, and how many MEC resources to allocate for executing a task. We propose to solve the problem by developing a dynamic scheduling policy based on Deep Reinforcement Learning (DRL) with the Deep Deterministic Policy Gradient (DDPG) method. A new decentralized DDPG algorithm is developed to obtain the optimum designs for multi-cell MEC systems by leveraging the cooperation among neighboring MEC servers. Simulation results demonstrate that the proposed algorithm outperforms existing strategies such as Deep Q-Network (DQN).
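The weighted objective described above can be sketched in a few lines. The function names and default weights below are illustrative assumptions, not the paper's actual formulation:

```python
# Illustrative sketch of a weighted per-slot cost like the one in the
# abstract. The weights w_energy, w_delay, w_fetch are hypothetical
# tuning knobs balancing energy, delay, and cache-fetching cost.
def slot_cost(energy_j, delay_s, fetch_cost,
              w_energy=1.0, w_delay=1.0, w_fetch=0.5):
    """Weighted sum of energy, delay, and cache-fetching cost."""
    return w_energy * energy_j + w_delay * delay_s + w_fetch * fetch_cost

def long_term_average_cost(slots):
    """Average the per-slot cost over a trajectory of (E, D, F) tuples."""
    return sum(slot_cost(e, d, f) for e, d, f in slots) / len(slots)
```

Raising `w_delay` relative to `w_energy` would steer a policy trained against this objective toward lower latency at higher energy cost, which is the tradeoff the abstract mentions.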
Funding: co-supported by the National Natural Science Foundation of China (Nos. 62003267 and 61573285), the Aeronautical Science Foundation of China (ASFC) (No. 20175553027), and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2020JQ-220).
Abstract: Unmanned Aerial Vehicles (UAVs) play a vital role in military warfare. In a variety of battlefield mission scenarios, UAVs are required to fly safely to designated locations without human intervention. Therefore, finding a suitable method to solve the UAV Autonomous Motion Planning (AMP) problem can improve the success rate of UAV missions to a certain extent. In recent years, many studies have used Deep Reinforcement Learning (DRL) methods to address the AMP problem and have achieved good results. From the perspective of sampling, this paper designs a double-screening sampling method, combines it with the Deep Deterministic Policy Gradient (DDPG) algorithm, and proposes the Relevant Experience Learning-DDPG (REL-DDPG) algorithm. REL-DDPG uses a Prioritized Experience Replay (PER) mechanism to break the correlation of consecutive experiences in the experience pool, finds the experiences most similar to the current state to learn from, following ideas from human education, and expands the influence of the learning process on action selection in the current state. All experiments are conducted in a complex unknown simulation environment constructed from the parameters of a real UAV. Training experiments show that REL-DDPG improves both the convergence speed and the converged result compared with the state-of-the-art DDPG algorithm, while testing experiments show the applicability of the algorithm and investigate its performance under different parameter settings.
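The "experiences most similar to the current state" idea can be sketched as a nearest-neighbor query over a replay pool. The similarity metric (Euclidean distance) and the tuple layout below are assumptions for illustration; the paper's actual screening rule may differ:

```python
# Hedged sketch of relevant-experience selection: from a replay pool,
# pick the k transitions whose stored states are closest (Euclidean
# distance) to the current state. Field layout is illustrative.
def select_relevant(pool, current_state, k=2):
    """pool: list of (state, action, reward, next_state) tuples,
    where state is a tuple of floats."""
    def dist(s):
        return sum((a - b) ** 2 for a, b in zip(s, current_state)) ** 0.5
    return sorted(pool, key=lambda tr: dist(tr[0]))[:k]
```

A learner would then weight or replay these k transitions more heavily when updating the policy for the current state.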
Funding: supported by the Key Laboratory of Defense Science and Technology Foundation of the Luoyang Electro-optical Equipment Research Institute (No. 6142504200108).
Abstract: The ever-changing battlefield environment requires robust and adaptive technologies integrated into a reliable platform. Unmanned combat aerial vehicles (UCAVs) aim to integrate such advanced technologies while increasing the tactical capabilities of combat aircraft. A common UCAV uses a neural-network fitting strategy to obtain the values of attack areas. However, this simple strategy cannot cope with complex environmental changes or autonomously optimize decision-making. To solve this problem, this paper proposes a new deep deterministic policy gradient (DDPG) strategy based on deep reinforcement learning for fitting the attack areas of UCAVs on the future battlefield. Simulation results show that the autonomy and environmental adaptability of UCAVs will improve under the new DDPG algorithm and that the training process converges quickly. With the well-trained deep network, the optimal values of attack areas can be obtained in real time during the whole flight.
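One core mechanic shared by the DDPG variants discussed above is the slow "soft" (Polyak) update of target networks toward the online networks, which stabilizes training. A minimal sketch, with weights represented as plain lists of floats rather than tensors:

```python
# Polyak (soft) target-network update used by DDPG: target weights
# slowly track the online weights so the bootstrapped targets change
# smoothly. Real implementations apply this per-tensor; lists of
# floats are used here purely for illustration.
def soft_update(target, online, tau=0.005):
    """Return tau * online + (1 - tau) * target, elementwise."""
    return [tau * w + (1.0 - tau) * t for w, t in zip(online, target)]
```

With a small `tau`, the target network lags the online network by many updates, which is what makes the temporal-difference targets in DDPG quasi-stationary.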
Funding: supported by the National Natural Science Foundation of China (Grant Nos. 60572071 and 60873101), the Natural Science Foundation of Jiangsu Province (Grant Nos. BM2006504, BK2007104, and BK2008209), and the College Natural Science Foundation of Jiangsu Province (Grant No. 06KJB520137).
Abstract: A novel, efficient deterministic secure quantum communication scheme based on four-qubit cluster states and single-photon identity authentication is proposed. In this scheme, the two authenticated users can transmit two bits of classical information per cluster state, and the efficiency of the quantum communication is 1/3, approximately 1.67 times that of the previous protocol presented by Wang et al [Chin. Phys. Lett. 23 (2006) 2658]. Security analysis shows the present scheme is secure against the intercept-resend attack and the impersonation attack. Furthermore, it is more economical with present-day techniques and is easily processed by a one-way quantum computer.
Funding: supported by the National Natural Science Foundation of China (No. 61673060) and the National Key R&D Plan (No. 2016YFB0501700).
Abstract: Gravity-aided inertial navigation is an active research topic in applications of the underwater autonomous vehicle (UAV). Since the matching process is conducted against a gravity anomaly database tabulated as a digital model with a resolution of 2′ × 2′, a filter model based on vehicle position is derived, and the particularity of the inertial navigation system (INS) output is employed to estimate a parameter in the system model. Meanwhile, a matching algorithm based on the point mass filter (PMF) is applied and several optimal selection strategies are discussed. The results show that the point mass filter based on deterministic resampling is the most practical. The reliability and accuracy of the algorithm are verified via simulation tests.
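The point mass filter mentioned above maintains a discrete probability mass over grid cells and applies Bayes' rule at each measurement. The following is a minimal one-dimensional sketch of the measurement-update step only; the grid, likelihood, and prediction step of the paper's gravity-matching formulation are not reproduced here:

```python
# Sketch of the Bayesian measurement update in a 1-D point mass (grid)
# filter: a discrete prior over grid cells is multiplied cell-by-cell
# by the measurement likelihood and renormalized to sum to one.
def pmf_update(prior, likelihood):
    """prior, likelihood: lists of the same length over grid cells."""
    posterior = [p * l for p, l in zip(prior, likelihood)]
    z = sum(posterior)  # normalizing constant (evidence)
    return [p / z for p in posterior]
```

In gravity matching, the likelihood of each cell would come from comparing the measured gravity anomaly against the database value stored for that cell; cells that match well gain probability mass.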
Funding: supported by the Key Science-Technology Project of the Shanghai Tenth Five-Year Plan, China (No. 031111002), the Specialized Research Fund for the Doctoral Program of Higher Education, China (No. 20040247033), and the Municipal Key Basic Research Program of Shanghai, China (No. 05JC14060).
Abstract: In response to production capacity and functionality variations, a genetic algorithm (GA) embedded with deterministic timed Petri nets (DTPN) is proposed to solve the scheduling problem of a reconfigurable production line (RPL). Basic DTPN modules are presented to model the corresponding variable structures in the RPL, and the scheduling model of the whole RPL is then constructed. In the scheduling algorithm, firing sequences of the Petri net model serve as chromosomes, so the selection, crossover, and mutation operators act not on elements of the problem space but on elements of the Petri net model. Accordingly, all the GA operations embedded with the Petri net model are specified. Moreover, a new weighted single-objective optimization based on reconfiguration cost and E/T is used. The results of scheduling a DC motor RPL suggest that the presented DTPN-GA algorithm has a significant impact on RPL scheduling and provides clear improvements over conventional scheduling methods in practice: it meets due dates, minimizes reconfiguration cost, and enhances cost effectiveness.
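Two of the GA operators described above can be sketched for chromosomes that are firing sequences (here simply permutations of transition labels). The swap mutation and tournament selection below are generic illustrations under that encoding, not the paper's exact operators, and the fitness function is a stand-in:

```python
import random

# Illustrative GA fragment where a chromosome is a Petri-net firing
# sequence (a permutation of transition labels). The swap mutation
# keeps the chromosome a valid permutation of the same transitions.
def swap_mutation(seq, rng):
    """Exchange two distinct positions in a firing sequence."""
    s = list(seq)
    i, j = rng.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

def tournament_select(pop, fitness, rng, k=2):
    """Pick the best of k randomly drawn chromosomes (lower is better)."""
    return min(rng.sample(pop, k), key=fitness)
```

A repair step (or a permutation-preserving crossover such as order crossover) would be needed so that offspring remain feasible firing sequences of the underlying net; that constraint handling is what embedding the Petri net model into the GA provides.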
Funding: supported by the National Natural Science Foundation of China under Grant No. 40005006, the Knowledge Innovation Key Project of the Chinese Academy of Sciences in the Resource Environment Field (No. KZCX2-203), and the National Key Programm
Abstract: The Pacific decadal and interdecadal oscillation (PDO) has been extensively explored in recent decades because of its profound impact on global climate systems. It is a long-lived, ENSO-like pattern of Pacific climate variability with a period of 10-30 years. The general picture is that anomalously warm (cool) SSTs in the central North Pacific are accompanied by anomalously cool (warm) SSTs, of comparable amplitude, along the west coast of America and in the central-east tropical Pacific. In general, there are two classes of opinion on the origin of this low-frequency climate variability: one holds that it results from deterministically coupled modes of the Pacific ocean-atmosphere system, the other that it arises from stochastic atmospheric forcing. The deterministic view emphasizes that internal physical processes in the air-sea system can provide a positive feedback mechanism to amplify an initial perturbation and a negative feedback mechanism to reverse the phase of the oscillation; the dynamic evolution of the ocean circulation determines the timescale of the oscillation. The stochastic view, however, emphasizes that because atmospheric activity can be thought of as having no preferred timescale and is associated with an essentially white-noise spectrum, the ocean response can exhibit a red peak in a low-frequency range with a decadal-to-interdecadal timescale. In this paper, the authors try to systematically survey the state of the art of observational, theoretical, and numerical studies of the PDO and hope to provide a useful background reference for current research.
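The stochastic-origin argument, that a slowly responding ocean reddens white atmospheric forcing, is commonly illustrated with a first-order autoregressive (AR(1), Hasselmann-type) response. The sketch below is a generic illustration with an assumed damping coefficient, not a model from this review:

```python
import random

# White-noise "atmospheric" forcing integrated by a damped, slowly
# adjusting "ocean": T[n+1] = a * T[n] + noise. With a close to 1 the
# output gains low-frequency (red) variance, visible as a strongly
# positive lag-1 autocorrelation that the white forcing itself lacks.
def ar1_response(a, n, rng):
    t, series = 0.0, []
    for _ in range(n):
        t = a * t + rng.gauss(0.0, 1.0)
        series.append(t)
    return series

def lag1_autocorr(x):
    mean = sum(x) / len(x)
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(len(x) - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den
```

Setting `a = 0` recovers the white forcing (lag-1 autocorrelation near zero), while `a` near 1 gives the red, low-frequency-dominated response invoked by the stochastic explanation of the PDO.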
Funding: supported by the National Science Council of the Republic of China (Grant No. NSC 98-2221-E-006-097-MY3).
Abstract: This study proposes a new coding function for the symmetric W state. Based on the new coding function, a theoretical protocol of deterministic quantum communication (DQC) is proposed. The sender can use the proposed coding function to encode his/her message, and the receiver can perform the imperfect Bell measurement to obtain the sender's message. In comparison with existing DQC protocols that also use the W-class state, the proposed protocol is more efficient and more practical with today's technology. Moreover, the security of this protocol is analyzed to show that any eavesdropper will be detected with very high probability under both the ideal and the noisy quantum channel.
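For reference, the symmetric three-qubit W state underlying such coding schemes is the standard equal superposition of single-excitation basis states (this is the textbook definition, not the paper's specific encoding):

```latex
|W\rangle = \frac{1}{\sqrt{3}}\bigl(|001\rangle + |010\rangle + |100\rangle\bigr)
```

Its defining robustness property is that tracing out any one qubit leaves the remaining pair entangled, which is what makes W-class states attractive carriers for communication protocols.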
Funding: supported in part by the US Department of Energy (DOE), Office of Electricity and Office of Energy Efficiency and Renewable Energy, under contract DE-AC05-00OR22725; in part by CURENT, an Engineering Research Center funded by the US National Science Foundation (NSF) and DOE under NSF award EEC-1041877; and in part by NSF award ECCS-1809458.
Abstract: In this paper, a day-ahead electricity market bidding problem with multiple strategic generation company (GENCO) bidders is studied. The problem is formulated as a Markov game model in which GENCO bidders interact with each other to develop their optimal day-ahead bidding strategies. Because of unobservable information in the problem, a model-free, data-driven approach known as multi-agent deep deterministic policy gradient (MADDPG) is applied to approximate the Nash equilibrium (NE) of the above Markov game. The MADDPG algorithm has the advantage of generalization due to the automatic feature-extraction ability of deep neural networks. The algorithm is tested on an IEEE 30-bus system with three competitive GENCO bidders in both an uncongested case and a congested case. Comparisons with a truthful bidding strategy and state-of-the-art deep reinforcement learning methods, including deep Q network and deep deterministic policy gradient (DDPG), demonstrate that the applied MADDPG algorithm finds a superior bidding strategy for all market participants, with increased profit gains. In addition, comparison with a conventional model-based method shows that the MADDPG algorithm has higher computational efficiency, which makes it feasible for real-world applications.
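The structural idea that distinguishes MADDPG from single-agent DDPG is centralized training with decentralized execution: each agent's critic sees every agent's observation and action, while each actor conditions only on its own observation. The sketch below only builds those two input vectors (no networks involved); names are illustrative:

```python
# In MADDPG each agent i trains a critic Q_i(o_1..o_N, a_1..a_N) on
# the joint observation-action vector, while its actor pi_i(o_i) uses
# only the agent's own observation at execution time.
def critic_input(observations, actions):
    """Concatenate every agent's observation and action (joint input)."""
    joint = []
    for o in observations:
        joint.extend(o)
    for a in actions:
        joint.extend(a)
    return joint

def actor_input(observations, agent_idx):
    """Each decentralized actor conditions only on its own observation."""
    return list(observations[agent_idx])
```

Conditioning the critic on the joint action makes the learning target stationary from each agent's point of view, which is why MADDPG copes with the non-stationarity that breaks independent DDPG learners in a multi-bidder game.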
Funding: partially supported by the National Natural Science Foundation of China (Nos. 61833013 and 62003162), the Natural Science Foundation of Jiangsu Province of China (No. BK20200416), the China Postdoctoral Science Foundation (Nos. 2020TQ0151 and 2020M681590), and the Natural Sciences and Engineering Research Council of Canada.
Abstract: Unmanned aerial vehicles (UAVs) have been extensively used in civil and industrial applications due to the rapid development of guidance, navigation, and control (GNC) technologies. In particular, using deep reinforcement learning methods for motion control has made major progress recently, since deep Q-learning has been successfully extended to continuous action domains. This paper proposes an improved deep deterministic policy gradient (DDPG) algorithm for the path-following control problem of a UAV. A specific reward function is designed to minimize the cross-track error of the path-following problem. In the training phase, a double experience replay buffer (DERB) is used to increase learning efficiency and accelerate convergence. First, the model of the UAV path-following problem is established. After that, the framework of the DDPG algorithm is constructed, and the state space, action space, and reward function of the UAV path-following algorithm are designed. The DERB is proposed to accelerate the training phase. Finally, simulation results show the effectiveness of the proposed DERB-DDPG method.
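A double experience replay buffer can be sketched as two pools with mixed sampling. The split criterion (a reward threshold) and the mixing fraction below are assumptions for illustration; the paper's actual DERB rule may differ:

```python
import random

# Hypothetical double experience replay buffer: transitions with high
# reward go to a "success" pool, the rest to an "ordinary" pool, and
# minibatches mix the two so informative experiences are replayed more
# often than uniform sampling would allow.
class DoubleReplayBuffer:
    def __init__(self, reward_threshold=0.0):
        self.threshold = reward_threshold
        self.success, self.ordinary = [], []

    def add(self, transition):
        """transition: (state, action, reward, next_state) tuple."""
        reward = transition[2]
        pool = self.success if reward > self.threshold else self.ordinary
        pool.append(transition)

    def sample(self, batch_size, success_frac, rng):
        """Draw a minibatch with roughly success_frac from the success pool."""
        n_succ = min(int(batch_size * success_frac), len(self.success))
        batch = rng.sample(self.success, n_succ)
        batch += rng.sample(self.ordinary,
                            min(batch_size - n_succ, len(self.ordinary)))
        return batch
```

Biasing replay toward high-reward transitions tends to speed up early learning on sparse-reward control tasks, which matches the convergence-acceleration role the abstract assigns to the DERB.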