Paper Title
Graph Reparameterizations for Enabling 1000+ Monte Carlo Iterations in Bayesian Deep Neural Networks
Paper Authors
Paper Abstract
Uncertainty estimation in deep models is essential in many real-world applications and has benefited from developments over the last several years. Recent evidence suggests that existing solutions dependent on simple Gaussian formulations may not be sufficient. However, moving to other distributions necessitates Monte Carlo (MC) sampling to estimate quantities such as the KL divergence: such sampling can be expensive and scales poorly as the dimensions of both the input data and the model grow. This is directly related to the structure of the computation graph, which can grow linearly as a function of the number of MC samples needed. Here, we construct a framework to describe these computation graphs, and identify probability families where the graph size can be independent of, or only weakly dependent on, the number of MC samples. These families correspond directly to large classes of distributions. Empirically, we can run a much larger number of iterations for MC approximations for larger architectures used in computer vision, with gains in performance measured in confident accuracy, stability of training, memory usage, and training time.
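To make the computational issue concrete, below is a minimal PyTorch sketch of the standard MC estimate of a KL term via reparameterized sampling. This is not the paper's graph reparameterization; the function names (mc_kl_naive, mc_kl_batched) and the Gaussian posterior/prior choice are illustrative assumptions. In the loop-based version, every sample adds its own nodes to the autograd graph, so graph size grows linearly with the number of MC samples; batching the draws into one tensor keeps the node count fixed, loosely illustrating the kind of weak dependence on sample count that the abstract discusses.

    import torch

    def mc_kl_naive(mu, log_sigma, prior, num_samples=100):
        # MC estimate of KL(q || prior) with one graph branch per sample:
        # memory grows roughly linearly in num_samples.
        sigma = log_sigma.exp()
        kl_terms = []
        for _ in range(num_samples):
            eps = torch.randn_like(mu)   # noise drawn outside the parameters
            w = mu + sigma * eps         # reparameterized, differentiable sample
            log_q = torch.distributions.Normal(mu, sigma).log_prob(w).sum()
            log_p = prior.log_prob(w).sum()
            kl_terms.append(log_q - log_p)
        return torch.stack(kl_terms).mean()

    def mc_kl_batched(mu, log_sigma, prior, num_samples=100):
        # Same estimator, but all samples are drawn in one batched tensor op,
        # so the number of graph nodes no longer depends on num_samples
        # (tensor sizes, and hence activations, still do).
        sigma = log_sigma.exp()
        eps = torch.randn(num_samples, *mu.shape)
        w = mu + sigma * eps
        log_q = torch.distributions.Normal(mu, sigma).log_prob(w).sum(dim=-1)
        log_p = prior.log_prob(w).sum(dim=-1)
        return (log_q - log_p).mean()

    # Usage sketch: a 10-dimensional mean-field Gaussian posterior, standard normal prior.
    mu = torch.zeros(10, requires_grad=True)
    log_sigma = torch.zeros(10, requires_grad=True)
    prior = torch.distributions.Normal(0.0, 1.0)
    mc_kl_batched(mu, log_sigma, prior, num_samples=1000).backward()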