Paper Title

MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning

Paper Authors

Markus Peschl, Arkady Zgonnikov, Frans A. Oliehoek, Luciano C. Siebert

Paper Abstract

Inferring reward functions from demonstrations and pairwise preferences are auspicious approaches for aligning Reinforcement Learning (RL) agents with human intentions. However, state-of-the-art methods typically focus on learning a single reward model, thus rendering it difficult to trade off different reward functions from multiple experts. We propose Multi-Objective Reinforced Active Learning (MORAL), a novel method for combining diverse demonstrations of social norms into a Pareto-optimal policy. Through maintaining a distribution over scalarization weights, our approach is able to interactively tune a deep RL agent towards a variety of preferences, while eliminating the need for computing multiple policies. We empirically demonstrate the effectiveness of MORAL in two scenarios, which model a delivery and an emergency task that require an agent to act in the presence of normative conflicts. Overall, we consider our research a step towards multi-objective RL with learned rewards, bridging the gap between current reward learning and machine ethics literature.
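
The abstract mentions maintaining a distribution over scalarization weights so that a single deep RL agent can be tuned toward different preference trade-offs. The snippet below is a minimal sketch of that scalarization idea, not the authors' implementation: the Dirichlet distribution, the two-objective reward vector, and all variable names are illustrative assumptions.

```python
# Minimal sketch (assumed, not from the MORAL paper's code) of collapsing a
# vector-valued reward into a scalar via weights sampled from a maintained
# distribution, so one policy can be steered toward different trade-offs.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical distribution over scalarization weights for two objectives
# (e.g., a learned norm-based reward and a primary task reward).
dirichlet_alpha = np.array([2.0, 5.0])

def scalarize(reward_vector: np.ndarray, weights: np.ndarray) -> float:
    """Linearly scalarize a multi-objective reward with the given weights."""
    return float(np.dot(weights, reward_vector))

# One interaction step: sample weights, scalarize the vector reward, and pass
# the resulting scalar to a standard deep RL update (omitted here).
weights = rng.dirichlet(dirichlet_alpha)            # sums to 1 by construction
vector_reward = np.array([0.3, 1.0])                # [norm reward, task reward]
scalar_reward = scalarize(vector_reward, weights)
print(f"weights={weights}, scalarized reward={scalar_reward:.3f}")
```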
