Simultaneous Navigation and Radio Mapping for Cellular-Connected UAV with Deep Reinforcement Learning

Paper title

Simultaneous Navigation and Radio Mapping for Cellular-Connected UAV with Deep Reinforcement Learning

Authors

Yong Zeng, Xiaoli Xu, Shi Jin, Rui Zhang

Abstract

The cellular-connected unmanned aerial vehicle (UAV) is a promising technology for unlocking the full potential of UAVs in the future. However, achieving ubiquitous three-dimensional (3D) communication coverage for UAVs in the sky is a new challenge. In this paper, we tackle this challenge with a new coverage-aware navigation approach, which exploits the UAV's controllable mobility to design its navigation/trajectory so that it avoids the coverage holes of cellular base stations (BSs) while accomplishing its mission. We formulate a UAV trajectory optimization problem to minimize the weighted sum of its mission completion time and expected communication outage duration, and propose a new solution approach based on deep reinforcement learning (DRL). To further improve the performance, we propose a new framework called simultaneous navigation and radio mapping (SNARM), where the UAV's signal measurements are used not only to train the deep Q network (DQN) directly, but also to create a radio map that can predict the outage probabilities at all locations in the area of interest. This enables simulated UAV trajectories to be generated and their expected returns to be predicted, which are then used to further train the DQN via the Dyna technique, thus greatly improving the learning efficiency.
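
The Dyna idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the DQN for a tabular Q-function, uses a hypothetical 5x5 grid with made-up outage probabilities, and assumes the UAV's movement model is known so that only the outage level has to be learned. The names `train`, `step`, `reward`, `HOLES`, and `MU` are all introduced here for illustration. Each real step updates both the Q-values and a running-average radio map; the radio map then supplies expected outage for extra simulated updates.

```python
import random

# Hypothetical setup: every constant here is an illustrative assumption.
N = 5                      # grid is N x N cells
GOAL = (4, 4)              # mission destination
MU = 5.0                   # weight of outage relative to one time step
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
HOLES = {(1, 1), (2, 2), (3, 3), (2, 1), (1, 2)}   # made-up coverage holes
# Ground-truth outage probabilities, unknown to the learner.
TRUE_OUTAGE = {(x, y): (0.9 if (x, y) in HOLES else 0.05)
               for x in range(N) for y in range(N)}


def step(state, action):
    """Assumed-known kinematics: move one cell, clipped to the grid."""
    x = min(max(state[0] + action[0], 0), N - 1)
    y = min(max(state[1] + action[1], 0), N - 1)
    return (x, y)


def reward(outage):
    """Per-step cost: one unit of time plus MU times the outage level."""
    return -1.0 - MU * outage


def train(episodes=300, planning_steps=10, alpha=0.2, gamma=1.0,
          eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {((x, y), a): 0.0
         for x in range(N) for y in range(N) for a in range(4)}
    stats = {}             # radio map statistics: cell -> [outage_sum, visits]

    def update(s, a, r, s2):
        # Standard Q-learning backup; the goal is terminal.
        future = 0.0 if s2 == GOAL else max(q[(s2, b)] for b in range(4))
        q[(s, a)] += alpha * (r + gamma * future - q[(s, a)])

    for _ in range(episodes):
        s = (0, 0)
        for _ in range(100):
            if s == GOAL:
                break
            if rng.random() < eps:
                a = rng.randrange(4)
            else:
                a = max(range(4), key=lambda b: q[(s, b)])
            s2 = step(s, ACTIONS[a])
            # Real experience: a measured outage indicator at the new cell.
            outage = 1.0 if rng.random() < TRUE_OUTAGE[s2] else 0.0
            cell = stats.setdefault(s2, [0.0, 0])
            cell[0] += outage
            cell[1] += 1
            update(s, a, reward(outage), s2)
            # Simulated experience: expected outage read from the radio map.
            for _ in range(planning_steps):
                ps = (rng.randrange(N), rng.randrange(N))
                pa = rng.randrange(4)
                ps2 = step(ps, ACTIONS[pa])
                if ps == GOAL or ps2 not in stats:
                    continue
                predicted = stats[ps2][0] / stats[ps2][1]
                update(ps, pa, reward(predicted), ps2)
            s = s2

    radio_map = {c: total / n for c, (total, n) in stats.items()}
    return q, radio_map
```

Calling `q, radio_map = train()` returns the learned Q-table and the estimated outage probability of every visited cell. Setting `planning_steps=0` reduces the sketch to plain Q-learning, which gives a simple baseline for comparing how much the simulated updates help.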
