Paper title

Silly rules improve the capacity of agents to learn stable enforcement and compliance behaviors

Authors

Köster, Raphael; Hadfield-Menell, Dylan; Hadfield, Gillian K.; Leibo, Joel Z.

Abstract

How can societies learn to enforce and comply with social norms? Here we investigate the learning dynamics and emergence of compliance and enforcement of social norms in a foraging game, implemented in a multi-agent reinforcement learning setting. In this spatiotemporally extended game, individuals are incentivized to implement complex berry-foraging policies and punish transgressions against social taboos covering specific berry types. We show that agents benefit when eating poisonous berries is taboo, meaning the behavior is punished by other agents, as this helps overcome a credit-assignment problem in discovering delayed health effects. Critically, however, we also show that introducing an additional taboo, which results in punishment for eating a harmless berry, improves the rate and stability with which agents learn to punish taboo violations and comply with taboos. Counterintuitively, our results show that an arbitrary taboo (a "silly rule") can enhance social learning dynamics and achieve better outcomes in the middle stages of learning. We discuss the results in the context of studying normativity as a group-level emergent phenomenon.
