Paper title
Hybrid intelligence for dynamic job-shop scheduling with deep reinforcement learning and attention mechanism
Paper authors
Paper abstract
The dynamic job-shop scheduling problem (DJSP) is a class of scheduling tasks that explicitly account for inherent uncertainties, such as changing order requirements and possible machine breakdowns, in realistic smart manufacturing settings. Since traditional methods cannot dynamically generate effective scheduling strategies in the face of environmental disturbances, we formulate the DJSP as a Markov decision process (MDP) to be tackled by reinforcement learning (RL). For this purpose, we propose a flexible hybrid framework that takes disjunctive graphs as states and a set of general dispatching rules as the action space, requiring minimal prior domain knowledge. An attention mechanism serves as the graph representation learning (GRL) module for extracting state features, and a double dueling deep Q-network with prioritized replay and noisy networks (D3QPN) maps each state to the most appropriate dispatching rule. Furthermore, we present Gymjsp, a public benchmark based on the well-known OR-Library, to provide a standardized off-the-shelf facility for the RL and DJSP research communities. Comprehensive experiments on various DJSP instances confirm that the proposed framework outperforms baseline algorithms, achieving a smaller makespan on all instances, and provide empirical justification for the validity of each component of the hybrid framework.
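To illustrate what "dispatching rules as the action space" can look like, the following is a minimal sketch in plain Python. It is not the Gymjsp API and not the authors' implementation: the rule set (SPT, LPT, MWKR, FIFO), the toy instance `JOBS`, and the names `RULES` and `run_episode` are all assumptions made for illustration, and the `policy` callable merely stands in for the learned D3QPN agent. At each decision step the policy names a rule, the rule selects one job, and that job's next operation is scheduled as early as its machine and its predecessor allow.

```python
from typing import Callable, Dict, List, Tuple

# A job is an ordered list of operations: (machine index, processing time).
Job = List[Tuple[int, int]]

# Hypothetical toy instance (3 jobs, 3 machines); not taken from OR-Library.
JOBS: List[Job] = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3)],
]

# Each rule scores a candidate job; the job with the LOWEST score is dispatched.
# Arguments: all jobs, next-operation index per job, ready time per job, job id.
Rule = Callable[[List[Job], List[int], List[int], int], float]

RULES: Dict[str, Rule] = {
    # Shortest processing time of the job's next operation.
    "SPT": lambda jobs, nxt, ready, j: jobs[j][nxt[j]][1],
    # Longest processing time of the job's next operation.
    "LPT": lambda jobs, nxt, ready, j: -jobs[j][nxt[j]][1],
    # Most work remaining (sum of remaining processing times).
    "MWKR": lambda jobs, nxt, ready, j: -sum(d for _, d in jobs[j][nxt[j]:]),
    # First in, first out: the job that became ready earliest.
    "FIFO": lambda jobs, nxt, ready, j: ready[j],
}


def run_episode(jobs: List[Job], policy: Callable[[int], str]) -> int:
    """Schedule all operations and return the makespan.

    `policy(step)` returns the name of the rule to apply at that step;
    in an RL setting it would be replaced by the trained agent.
    """
    n_machines = 1 + max(m for job in jobs for m, _ in job)
    nxt = [0] * len(jobs)        # index of each job's next unscheduled operation
    ready = [0] * len(jobs)      # time at which each job's next operation may start
    free = [0] * n_machines      # time at which each machine becomes free
    step = 0
    while True:
        candidates = [j for j in range(len(jobs)) if nxt[j] < len(jobs[j])]
        if not candidates:
            break
        score = RULES[policy(step)]
        # Ties are broken by the smaller job id.
        j = min(candidates, key=lambda c: (score(jobs, nxt, ready, c), c))
        machine, duration = jobs[j][nxt[j]]
        start = max(ready[j], free[machine])
        ready[j] = free[machine] = start + duration
        nxt[j] += 1
        step += 1
    return max(free)


if __name__ == "__main__":
    for name in RULES:
        print(name, run_episode(JOBS, lambda step, n=name: n))
```

Fixing one rule for the whole episode, as the loop at the bottom does, corresponds to a classical single-rule baseline; a learned policy would instead switch rules from step to step based on the current state.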