Abstract
In shared autonomy, humans and intelligent robots jointly complete real-time control tasks with complementary capabilities, achieving performance that neither side could attain alone. Many existing shared autonomy methods assume that human decisions are always “effective”, i.e., that they promote task completion and faithfully reflect the human's true intention. In reality, however, human decisions can be “ineffective” to some extent for reasons such as fatigue or inattentiveness. This violates the basic assumption of these methods, causing them to break down and ultimately leading to task failure. In this work, we propose a novel deep reinforcement learning-based shared autonomy method for human-machine systems that enables the system to complete the correct goal even when human decisions are ineffective for a long period. Specifically, we use deep reinforcement learning to train an end-to-end mapping from system states and human decisions to decision values, which is used to explicitly judge whether a human decision is ineffective. If it is, the robot takes over the system for better performance. We apply the method to real-time control tasks, and the results show that it judges the effectiveness of human decisions in a timely and accurate manner, allocates control authority accordingly, and ultimately improves system performance.
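The arbitration idea in the abstract (score the human decision with a learned value function, and hand control to the robot when that decision is judged ineffective) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `arbitrate`, the `tolerance` threshold, the specific rule "take over when the human action's value falls more than `tolerance` below the best available value", and the toy value table are all assumptions made for illustration. In the paper's setting the values would come from a trained deep reinforcement learning model rather than a hand-written array.

```python
import numpy as np


def arbitrate(q_values, human_action, tolerance=0.1):
    """Choose who controls the system for one step.

    q_values:     estimated value of each discrete action in the current state
                  (hypothetical stand-in for a learned value network's output).
    human_action: index of the action the human selected.
    tolerance:    assumed threshold; a larger value trusts the human more.

    Returns (action_to_execute, robot_took_over).
    """
    q_values = np.asarray(q_values, dtype=float)
    best_action = int(np.argmax(q_values))
    # Value lost by following the human instead of the best-valued action.
    gap = q_values[best_action] - q_values[human_action]
    if gap > tolerance:
        # Human decision judged ineffective: the robot takes over.
        return best_action, True
    # Human decision judged effective: keep the human in control.
    return human_action, False


# Toy value estimates for three actions in a single state (illustrative only).
state_values = [1.0, 0.2, 0.95]
print(arbitrate(state_values, human_action=2))  # near-optimal human choice is kept
print(arbitrate(state_values, human_action=1))  # poor human choice is overridden
```

The threshold makes the trade-off explicit: with a small `tolerance` the robot intervenes often, while a large one leaves the human in control unless the decision is clearly harmful.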
Authors
游诗艺
康宇
赵云波
张倩倩
Shiyi YOU; Yu KANG; Yun-Bo ZHAO; Qianqian ZHANG (Department of Automation, University of Science and Technology of China, Hefei 230026, China; State Key Laboratory of Fire Science, University of Science and Technology of China, Hefei 230026, China; Institute of Advanced Technology, University of Science and Technology of China, Hefei 230088, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230026, China; School of Artificial Intelligence, Anhui University, Hefei 230026, China)
Source
《中国科学:信息科学》
CSCD
PKU Core (Peking University core journal list)
2022, Issue 12, pp. 2165-2177 (13 pages)
Scientia Sinica(Informationis)
Funding
Supported by the Science and Technology Innovation 2030 “New Generation Artificial Intelligence” Major Project (Grant No. 2018AAA0100800).
Keywords
human-machine system
shared autonomy
non-full-time effective decision
deep reinforcement learning
arbitration