A Deep Reinforcement Learning Based Approach for NOMA-Based Random Access Network with Truncated Channel Inversion Power Control

Paper Title

A Deep Reinforcement Learning based Approach for NOMA-based Random Access Network with Truncated Channel Inversion Power Control

Authors

Chen, Ziru; Zhang, Ran; Cai, Lin X.; Cheng, Yu; Liu, Yong

Abstract

As a main use case of 5G and beyond wireless networks, the ever-increasing number of machine-type communication (MTC) devices has posed critical challenges to MTC networks in recent years. It is imperative to support massive numbers of MTC devices with limited resources. To this end, non-orthogonal multiple access (NOMA) based random access has been regarded as a promising candidate for MTC networks. In this paper, we propose a deep reinforcement learning (RL) based approach for a NOMA-based random access network with truncated channel inversion power control. Specifically, each MTC device randomly selects a pre-defined power level with a certain probability for data transmission. Devices use channel inversion power control, subject to an upper bound on the transmission power. Owing to the stochastic nature of channel fading and the limited transmission power, devices with different achievable power levels are categorized as different device types. To achieve high throughput while accounting for fairness among all devices, two objective functions are formulated: one maximizes the minimum long-term expected throughput of all MTC devices, and the other maximizes the geometric mean of the long-term expected throughput of all MTC devices. A policy-based deep reinforcement learning approach is then applied to tune the transmission probabilities of each device and thereby solve the formulated optimization problems. Extensive simulations are conducted to show the merits of the proposed approach.
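The abstract gives no formulas, so the following is only a minimal sketch of how the truncation rule and the two fairness objectives might look under simple assumptions. It assumes that channel inversion means a device needs transmit power `v / channel_gain` to hit a target received power `v`, and that a level is unachievable when this exceeds the cap `p_max`. The function names, parameters, and numbers are hypothetical and are not taken from the paper.

```python
import math


def achievable_levels(channel_gain, target_rx_powers, p_max):
    """Return the target received-power levels a device can reach.

    Assumption: under channel inversion, the transmit power needed for a
    target received power v is v / channel_gain. Truncation drops every
    level whose required transmit power exceeds the cap p_max.
    """
    return [v for v in target_rx_powers if v / channel_gain <= p_max]


def min_throughput(throughputs):
    """Max-min fairness objective: the smallest per-device throughput."""
    return min(throughputs)


def geometric_mean_throughput(throughputs):
    """Geometric-mean objective over per-device throughputs.

    Returns 0.0 if any device has zero throughput, since the product
    (and hence the geometric mean) is then zero.
    """
    if any(t <= 0 for t in throughputs):
        return 0.0
    log_sum = sum(math.log(t) for t in throughputs)
    return math.exp(log_sum / len(throughputs))


if __name__ == "__main__":
    # Hypothetical device with channel gain 0.5 and a transmit cap of 5.0.
    print(achievable_levels(0.5, [1.0, 2.0, 4.0], 5.0))
    # Hypothetical long-term expected throughputs of three devices.
    print(min_throughput([0.2, 0.5, 0.3]))
    print(geometric_mean_throughput([0.2, 0.5, 0.3]))
```

In this sketch, devices whose `achievable_levels` results differ would correspond to the different device types mentioned in the abstract. Either objective function could serve as the reward signal that a policy-based learner uses when tuning transmission probabilities, although the paper's actual reward design is not stated here.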
