Dynamic Temporal Reconciliation by Reinforcement Learning

Paper title

Dynamic Temporal Reconciliation by Reinforcement Learning

Authors

Charotia, Himanshi; Garg, Abhishek; Dhama, Gaurav; Maheshwari, Naman

Abstract

Planning based on long- and short-term time series forecasts is common practice across many industries. In this context, temporal aggregation and reconciliation techniques have been useful for improving forecasts, reducing model uncertainty, and providing coherent forecasts across different time horizons. However, an assumption underlying all these techniques is that data are completely available at every level of the temporal hierarchy. This assumption is mathematically convenient, but in practice the low-frequency data are usually only partially complete at forecasting time. High-frequency data, on the other hand, can change significantly in a scenario such as the COVID pandemic, and this change can be used to improve forecasts that would otherwise diverge significantly from the long-term actuals. We propose a dynamic reconciliation method in which the problem of informing low-frequency forecasts with high-frequency actuals is formulated as a Markov Decision Process (MDP), allowing for the fact that we do not have complete information about the dynamics of the process. This yields the best long-term estimates based on the most recent data available, even if the low-frequency cycles have only been partially completed. The MDP is solved using a Time Differenced Reinforcement Learning (TDRL) approach with customizable actions, which improves the long-term forecasts dramatically compared with relying solely on historical low-frequency data. The result also underscores that, while low-frequency forecasts can improve high-frequency forecasts, as noted in the temporal reconciliation literature (on the assumption that low-frequency forecasts have a lower noise-to-signal ratio), high-frequency forecasts can also be used to inform low-frequency forecasts.
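The abstract does not specify the states, actions, or rewards of the MDP, so the following is only a minimal sketch of how such a formulation might look, not the authors' implementation. It assumes a toy setting: a base low-frequency forecast (`base_total`) is rescaled by one of a few hypothetical multipliers, the state combines how far cumulative high-frequency actuals run ahead of or behind plan with the current multiplier, and the reward is the negative relative error against the period total. Tabular Q-learning is used here as one simple form of temporal-difference learning; all names, bucket widths, and the synthetic data generator are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action design: scale the base forecast by one of these factors.
MULTIPLIERS = [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
ACTIONS = [-1, 0, 1]  # move the multiplier index down, hold, or up


def ratio_bucket(cum_actual, cum_expected):
    """Discretize the ratio of cumulative actuals to the planned cumulative value."""
    ratio = cum_actual / cum_expected
    return max(0, min(6, int(round((ratio - 0.7) / 0.1))))


def step(idx, action):
    """Apply an action to the multiplier index, clamped to the valid range."""
    return max(0, min(len(MULTIPLIERS) - 1, idx + action))


def td_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One-step temporal-difference (Q-learning) update of the action value."""
    best_next = 0.0
    if next_state is not None:
        best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def simulate_period(base_total, n_steps, rng):
    """Synthetic high-frequency actuals with a random level shift (assumed data)."""
    shift = rng.choice([0.7, 0.85, 1.0, 1.15, 1.3])
    return [base_total / n_steps * shift * rng.uniform(0.9, 1.1) for _ in range(n_steps)]


def train(episodes=3000, base_total=300.0, n_steps=30, epsilon=0.1, seed=0):
    """Learn action values on simulated periods with an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        actuals = simulate_period(base_total, n_steps, rng)
        total = sum(actuals)
        idx = MULTIPLIERS.index(1.0)
        cum = actuals[0]
        state = (ratio_bucket(cum, base_total / n_steps), idx)
        for t in range(1, n_steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            idx = step(idx, action)
            # Reward: negative relative error of the adjusted low-frequency forecast.
            reward = -abs(MULTIPLIERS[idx] * base_total - total) / total
            cum += actuals[t]
            next_state = None
            if t < n_steps - 1:
                next_state = (ratio_bucket(cum, base_total * (t + 1) / n_steps), idx)
            td_update(q, state, action, reward, next_state)
            state = next_state
    return q


def reconcile(q, base_total, partial_actuals, n_steps=30):
    """Greedily adjust the base forecast using only the actuals observed so far."""
    idx = MULTIPLIERS.index(1.0)
    cum = 0.0
    for t, value in enumerate(partial_actuals, start=1):
        cum += value
        state = (ratio_bucket(cum, base_total * t / n_steps), idx)
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        idx = step(idx, action)
    return MULTIPLIERS[idx] * base_total
```

In this sketch, `reconcile(train(), 300.0, observed_values)` would return a rescaled period forecast from a partially completed period, where `observed_values` is a hypothetical list of the high-frequency actuals seen so far. The multiplier set stands in for the "customizable actions" mentioned in the abstract; a real application would need states, actions, and rewards chosen for the data at hand.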
