Paper title
Learning to Schedule Heuristics for the Simultaneous Stochastic Optimization of Mining Complexes
Paper authors
Paper abstract
The simultaneous stochastic optimization of mining complexes (SSOMC) is a large-scale stochastic combinatorial optimization problem. It simultaneously manages the extraction of materials from multiple mines and their processing in interconnected facilities to generate a set of final products, while accounting for material supply (geological) uncertainty to manage the associated risk. Although simulated annealing has been shown to outperform competing methods for solving the SSOMC, early performance may dominate recent performance, because a combination of the heuristics' performance is used to determine which perturbations to apply. This work proposes a data-driven framework for heuristic scheduling in a fully self-managed hyper-heuristic to solve the SSOMC. The proposed learn-to-perturb (L2P) hyper-heuristic is a multi-neighborhood simulated annealing algorithm. L2P selects the heuristic (perturbation) to apply in a self-adaptive manner, using reinforcement learning to efficiently explore which local search is best suited to a particular search point. Several state-of-the-art agents have been incorporated into L2P to better adapt the search and guide it towards better solutions. By learning from data describing the performance of the heuristics, L2P obtains a problem-specific ordering of heuristics that collectively finds better solutions faster. L2P is tested on several real-world mining complexes, with an emphasis on efficiency, robustness, and generalization capacity. Results show a reduction of 30-50% in the number of iterations and of 30-45% in computational time.
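The general idea of a multi-neighborhood simulated annealing loop whose perturbation is chosen by a learning agent can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the function name `l2p_sketch`, the toy cost function, the two toy heuristics, the reward definition (cost improvement), and the epsilon-greedy agent with a recency-weighted value update, which stands in for the reinforcement learning agents mentioned in the abstract. The recency-weighted update is one simple way to keep early rewards from dominating recent ones.

```python
import math
import random


def l2p_sketch(cost, heuristics, x0, iters=2000, t0=1.0, cooling=0.995,
               eps=0.1, alpha=0.2, seed=0):
    """Toy multi-neighborhood simulated annealing with a learned heuristic picker.

    Hypothetical stand-in for an RL agent: an epsilon-greedy bandit keeps one
    value estimate per heuristic and updates it with a constant step size, so
    recent rewards weigh more than early ones.
    """
    rng = random.Random(seed)
    q = [0.0] * len(heuristics)      # value estimate per heuristic
    counts = [0] * len(heuristics)   # how often each heuristic was applied
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    temp = t0
    for _ in range(iters):
        # Choose a heuristic: explore with probability eps, otherwise exploit.
        if rng.random() < eps:
            k = rng.randrange(len(heuristics))
        else:
            k = max(range(len(heuristics)), key=q.__getitem__)
        y = heuristics[k](x, rng)
        fy = cost(y)
        reward = max(0.0, fx - fy)       # assumed reward: cost improvement
        q[k] += alpha * (reward - q[k])  # recency-weighted update
        counts[k] += 1
        # Metropolis acceptance rule of simulated annealing (minimization).
        if fy <= fx or rng.random() < math.exp((fx - fy) / temp):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        temp = max(temp * cooling, 1e-12)  # geometric cooling with a floor
    return best, fbest, counts


# Toy problem (assumption): minimize the sum of squares of an integer vector.
def cost(x):
    return sum(v * v for v in x)


def small_step(x, rng):
    """Perturb one coordinate by +1 or -1."""
    y = list(x)
    i = rng.randrange(len(y))
    y[i] += rng.choice((-1, 1))
    return y


def big_jump(x, rng):
    """Reset one coordinate to a random value in [-10, 10]."""
    y = list(x)
    i = rng.randrange(len(y))
    y[i] = rng.randint(-10, 10)
    return y


best, fbest, counts = l2p_sketch(cost, [small_step, big_jump], [7, -5, 9, -8])
print("best cost:", fbest, "heuristic usage:", counts)
```

The `counts` list shows how often each heuristic was applied, which gives a rough view of the schedule the agent settled on for this toy problem. A real SSOMC solver would replace the toy cost and heuristics with the mining-complex objective and its perturbations.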