Paper Title

Bayesian Optimization for Developmental Robotics with Meta-Learning by Parameters Bounds Reduction

Authors

Maxime Petit, Emmanuel Dellandrea, Liming Chen

Abstract

In robotics, methods and software usually require hyperparameter optimization in order to be efficient for specific tasks, for instance industrial bin-picking from homogeneous heaps of different objects. We present a developmental framework based on long-term memory and reasoning modules (Bayesian Optimisation, visual similarity and parameter bounds reduction) that allows a robot to use a meta-learning mechanism to increase the efficiency of such continuous and constrained parameter optimizations. The new optimization, viewed as learning for the robot, can take advantage of past experiences (stored in the episodic and procedural memories) to shrink the search space by using reduced parameter bounds computed from the best optimizations the robot has achieved on tasks similar to the new one (e.g. bin-picking from a homogeneous heap of a similar object, based on the visual similarity of objects stored in the semantic memory). As an example, we applied the system to the constrained optimization of 9 continuous hyperparameters of a professional software (Kamido) in industrial robotic arm bin-picking tasks, a step that is needed each time a new object has to be handled correctly. We used a simulator to create bin-picking tasks for 8 different objects (7 in simulation and one with a real setup, without and with meta-learning using experiences from other similar objects), achieving good results despite a very small optimization budget, with better performance reached when meta-learning is used (84.3% vs 78.9% success overall, with a small budget of 30 iterations for each optimization) for every object tested (p-value=0.036).
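
The bounds-reduction idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`select_best_similar`, `reduce_bounds`), the experience record layout, the similarity threshold and the padding `margin` are all assumptions made for illustration. It only shows how the best parameter sets from visually similar past tasks could be turned into a narrower search box for a new optimization.

```python
from typing import Dict, List, Sequence, Tuple

Bounds = List[Tuple[float, float]]


def select_best_similar(experiences: Sequence[Dict],
                        min_similarity: float,
                        top_k: int) -> List[List[float]]:
    """Keep the top_k best-scoring parameter sets from similar past tasks.

    Each experience is a hypothetical record with keys
    'similarity' (to the new object), 'score' and 'params'.
    """
    similar = [e for e in experiences if e["similarity"] >= min_similarity]
    similar.sort(key=lambda e: e["score"], reverse=True)
    return [list(e["params"]) for e in similar[:top_k]]


def reduce_bounds(best_params: Sequence[Sequence[float]],
                  original_bounds: Bounds,
                  margin: float = 0.1) -> Bounds:
    """Shrink each parameter range to the span of past best values.

    The span is padded by margin * (original range width) on each side
    (an assumed heuristic) and clipped to the original bounds.
    """
    if not best_params:
        # No usable experience: fall back to the full search space.
        return list(original_bounds)
    reduced = []
    for i, (lo, hi) in enumerate(original_bounds):
        values = [p[i] for p in best_params]
        pad = margin * (hi - lo)
        reduced.append((max(lo, min(values) - pad),
                        min(hi, max(values) + pad)))
    return reduced


# Hypothetical past experiences for two hyperparameters.
past = [
    {"similarity": 0.9, "score": 0.80, "params": [0.40, 20.0]},
    {"similarity": 0.2, "score": 0.95, "params": [0.90, 45.0]},  # dissimilar
    {"similarity": 0.8, "score": 0.70, "params": [0.50, 30.0]},
]
best = select_best_similar(past, min_similarity=0.5, top_k=5)
new_bounds = reduce_bounds(best, [(0.0, 1.0), (10.0, 50.0)], margin=0.1)
print(new_bounds)
```

The reduced bounds would then be passed to the Bayesian optimizer in place of the original ones, so that a small budget (30 iterations in the abstract) is spent in a region that already worked for similar objects.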
