Paper Title
Reinforcement Learning Guided by Provable Normative Compliance
Paper Authors
Paper Abstract
Reinforcement learning (RL) has shown promise as a tool for engineering safe, ethical, or legal behaviour in autonomous agents. Its use typically relies on assigning punishments to state-action pairs that constitute unsafe or unethical choices. Although this assignment is a crucial step in the approach, there has been limited discussion of how to generalize the process of selecting punishments and deciding where to apply them. In this paper, we adopt an approach that leverages an existing framework, the normative supervisor of Neufeld et al. (2021), during training. The normative supervisor dynamically translates each state and the applicable normative system into a defeasible deontic logic theory, feeds this theory to a theorem prover, and uses the derived conclusions to decide whether to assign a punishment to the agent. We use multi-objective RL (MORL) to balance the ethical objective of avoiding violations against a non-ethical objective; we demonstrate that our approach works with multiple MORL techniques and show that it is effective regardless of the magnitude of the punishment assigned.
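As a concrete illustration, here is a minimal Python sketch of the kind of training loop the abstract describes. It is not the authors' implementation: the `NormativeSupervisor` class, `ToyEnv`, and the linear-scalarization MORL learner are hypothetical stand-ins (the real supervisor compiles states and norms into defeasible deontic logic theories and queries a theorem prover, and the paper evaluates several MORL techniques, of which linear scalarization is only one).

```python
import random
from collections import defaultdict

class NormativeSupervisor:
    """Hypothetical stand-in for the normative supervisor of Neufeld et al.
    (2021). In the actual framework, each state and the applicable normative
    system are compiled into a defeasible deontic logic theory and handed to
    a theorem prover; here the verdict is faked by a user-supplied predicate."""

    def __init__(self, violation_predicate):
        self.violates = violation_predicate

    def is_violation(self, state, action):
        # A real implementation would derive this conclusion from the prover.
        return self.violates(state, action)

class ToyEnv:
    """Tiny illustrative environment (not from the paper): walk along a
    corridor of length 5; 'shortcut' reaches the goal faster but is the
    action our toy normative system forbids."""

    def reset(self):
        return 0

    def actions(self, state):
        return ["step", "shortcut"]

    def step(self, state, action):
        next_state = state + (2 if action == "shortcut" else 1)
        done = next_state >= 5
        task_reward = 10.0 if done else -1.0
        return next_state, task_reward, done

def train(env, supervisor, episodes=500, punishment=-10.0,
          weights=(1.0, 1.0), alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular MORL via linear scalarization: a separate Q-table per
    objective (task reward vs. violation avoidance), combined only when
    selecting actions."""
    q_task = defaultdict(lambda: defaultdict(float))     # non-ethical objective
    q_ethical = defaultdict(lambda: defaultdict(float))  # violation avoidance

    def scalarized(state, action):
        return (weights[0] * q_task[state][action]
                + weights[1] * q_ethical[state][action])

    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            actions = env.actions(state)
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: scalarized(state, a))
            next_state, task_reward, done = env.step(state, action)
            # The supervisor, not a hand-coded reward table, decides where
            # the punishment applies.
            ethical_reward = (punishment
                              if supervisor.is_violation(state, action) else 0.0)
            if done:
                best_next = None
            else:
                best_next = max(env.actions(next_state),
                                key=lambda a: scalarized(next_state, a))
            for q, r in ((q_task, task_reward), (q_ethical, ethical_reward)):
                bootstrap = 0.0 if done else q[next_state][best_next]
                q[state][action] += alpha * (r + gamma * bootstrap - q[state][action])
            state = next_state
    return q_task, q_ethical

# Usage: forbid the shortcut and train; the agent learns the compliant route.
supervisor = NormativeSupervisor(lambda s, a: a == "shortcut")
q_task, q_ethical = train(ToyEnv(), supervisor)
```

Note that the punishment magnitude is just a parameter here, mirroring the abstract's claim that the approach's effectiveness does not hinge on how large the assigned punishment is.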