Paper Title
Are All Vision Models Created Equal? A Study of the Open-Loop to Closed-Loop Causality Gap
Paper Authors
Abstract
There is an ever-growing zoo of modern neural network models that can efficiently learn end-to-end control from visual observations. These advanced deep models, ranging from convolutional to patch-based networks, have been extensively tested on offline image classification and regression tasks. In this paper, we study these vision architectures with respect to the open-loop to closed-loop causality gap, i.e., offline training followed by online closed-loop deployment. This causality gap typically emerges in robotics applications such as autonomous driving, where a network is trained to imitate the control commands of a human. In this setting, two situations arise: 1) closed-loop testing in-distribution, where the test environment shares properties with the offline training data; and 2) closed-loop testing under distribution shifts, i.e., out-of-distribution. Contrary to recently reported results, we show that under proper training guidelines, all vision models perform indistinguishably well in in-distribution deployment, resolving the causality gap. In situation 2, we observe that the causality gap disrupts performance regardless of the choice of model architecture. Our results imply that the causality gap can be solved in situation 1 with any modern network architecture by following our proposed training guidelines, whereas achieving out-of-distribution generalization (situation 2) requires further investigation, for instance, into data diversity rather than model architecture.
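The difference between open-loop evaluation and closed-loop deployment described in the abstract can be illustrated with a toy sketch. Everything below is hypothetical and not taken from the paper: a one-dimensional lane-offset simulator, a proportional "expert", and a hand-written "learned" policy that matches the expert only on the offsets seen during training. It is meant only to show why a small offline imitation error says little about behaviour once the policy's own commands determine the states it sees next.

```python
import random


def expert_policy(offset):
    """Hypothetical expert: steer proportionally back toward the lane centre."""
    return -0.5 * offset


def learned_policy(offset):
    """Hypothetical imitation policy (an assumption for illustration).

    It matches the expert up to a small bias for offsets seen in training
    (|offset| <= 0.2) and under-reacts for larger, unseen offsets.
    """
    if abs(offset) <= 0.2:
        return -0.5 * offset + 0.01
    return -0.05 * offset


def rollout(policy, disturbances, start=0.0):
    """Closed loop: each command changes the next state the policy observes."""
    offsets = [start]
    for disturbance in disturbances:
        current = offsets[-1]
        offsets.append(current + policy(current) + disturbance)
    return offsets


def open_loop_error(policy, recorded_offsets):
    """Open loop: compare commands on states recorded from the expert only."""
    errors = [abs(policy(x) - expert_policy(x)) for x in recorded_offsets]
    return sum(errors) / len(errors)


def peak_deviation(offsets):
    """Largest distance from the lane centre reached during a rollout."""
    return max(abs(x) for x in offsets)


# Offline data: the expert drives under small random disturbances.
rng = random.Random(0)
train_disturbances = [rng.uniform(-0.08, 0.08) for _ in range(200)]
expert_log = rollout(expert_policy, train_disturbances)

# Open-loop score of the learned policy on the expert's recorded states.
offline_error = open_loop_error(learned_policy, expert_log)

# Situation 1: closed-loop test under the training-like disturbances.
in_dist_peak = peak_deviation(rollout(learned_policy, train_disturbances))

# Situation 2: closed-loop test under a shift (a constant crosswind).
shifted_disturbances = [0.15] * 50
shifted_expert_peak = peak_deviation(rollout(expert_policy, shifted_disturbances))
shifted_learned_peak = peak_deviation(rollout(learned_policy, shifted_disturbances))

print("open-loop command error:", offline_error)
print("closed-loop peak offset, in-distribution:", in_dist_peak)
print("closed-loop peak offset under shift (expert):", shifted_expert_peak)
print("closed-loop peak offset under shift (learned):", shifted_learned_peak)
```

In this toy setup the learned policy stays inside the range of offsets covered by the training data when tested in-distribution, while under the shifted disturbance it is pushed into offsets it never saw and drifts much further from the centre than the expert does. The same open-loop error is compatible with both outcomes, which is the gap the abstract refers to.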