Paper Title

Simulating multi-exit evacuation using deep reinforcement learning

Authors

Dong Xu, Xiao Huang, Joseph Mango, Xiang Li, Zhenlong Li

Abstract

Conventional simulations of multi-exit indoor evacuation focus primarily on how to determine a reasonable exit based on numerous factors in a changing environment. The results commonly include some congested exits and other under-utilized ones, especially with large numbers of pedestrians. We propose a multi-exit evacuation simulation based on Deep Reinforcement Learning (DRL), referred to as MultiExit-DRL, which incorporates a Deep Neural Network (DNN) framework to facilitate state-to-action mapping. The DNN framework applies Rainbow Deep Q-Network (DQN), a DRL algorithm that integrates several advanced DQN methods to improve data utilization and algorithm stability, and further divides the action space into eight isometric directions for possible pedestrian choices. We compare MultiExit-DRL with two conventional multi-exit evacuation simulation models in three separate scenarios: 1) varying pedestrian distribution ratios, 2) varying exit width ratios, and 3) varying open schedules for an exit. The results show that MultiExit-DRL achieves high learning efficiency while reducing the total number of evacuation frames in all designed experiments. In addition, the integration of DRL allows pedestrians to explore other potential exits and helps determine optimal directions, leading to highly efficient exit utilization.
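The abstract describes dividing the action space into eight isometric directions for pedestrian movement. As a minimal illustrative sketch (not the authors' code; function and constant names are assumptions), the discrete action index chosen by the DQN can be mapped to a unit movement vector spaced at 45° increments:

```python
import math

# Hypothetical sketch of the eight-direction action space from the abstract:
# action indices 0..7 map to unit (dx, dy) vectors spaced 45 degrees apart.
NUM_ACTIONS = 8

def action_to_direction(action: int) -> tuple[float, float]:
    """Map a discrete action index (0-7) to a unit (dx, dy) movement vector."""
    if not 0 <= action < NUM_ACTIONS:
        raise ValueError(f"action must be in [0, {NUM_ACTIONS}), got {action}")
    angle = 2 * math.pi * action / NUM_ACTIONS  # 45-degree increments
    return (math.cos(angle), math.sin(angle))

# Example: action 0 points along +x; action 2 points along +y.
dx, dy = action_to_direction(2)
```

In a DQN setting, the network outputs one Q-value per action, the agent picks the argmax index, and a mapping like this converts that index into a pedestrian displacement at each simulation frame.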
