Paper Title
WeightAlign: Normalizing Activations by Weight Alignment
Paper Authors
Paper Abstract
Batch normalization (BN) allows training very deep networks by normalizing activations with mini-batch sample statistics, which renders BN unstable for small batch sizes. Current small-batch solutions such as Instance Norm, Layer Norm, and Group Norm use channel statistics, which can be computed even for a single sample. Such methods are less stable than BN, as they critically depend on the statistics of a single input sample. To address this problem, we propose normalizing activations without sample statistics. We present WeightAlign: a method that normalizes the weights by the mean and scaled standard deviation computed within a filter, which normalizes activations without computing any sample statistics. Our proposed method is independent of batch size and stable over a wide range of batch sizes. Because weight statistics are orthogonal to sample statistics, we can directly combine WeightAlign with any method for activation normalization. We experimentally demonstrate these benefits for classification on CIFAR-10, CIFAR-100, and ImageNet, for semantic segmentation on PASCAL VOC 2012, and for domain adaptation on Office-31.
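A minimal PyTorch sketch of the idea described in the abstract: re-centering and re-scaling the convolution weights within each output filter so that activations are normalized without any sample statistics. The class name and the fan-in based scale factor below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightAlignedConv2d(nn.Conv2d):
    """Conv2d that re-centers and re-scales its weights per output filter
    before every forward pass (a sketch of the weight-alignment idea; the
    fan-in based gain below is an assumed choice for illustration)."""

    def forward(self, x):
        w = self.weight                       # shape: (out_c, in_c, kH, kW)
        out_c = w.size(0)
        w_flat = w.view(out_c, -1)
        # Per-filter mean and standard deviation (no sample statistics used).
        mean = w_flat.mean(dim=1).view(out_c, 1, 1, 1)
        std = w_flat.std(dim=1).view(out_c, 1, 1, 1)
        # Scale the standardized weights so activation variance stays stable;
        # the exact gain here is a hypothetical choice, not from the paper.
        gain = 1.0 / (w_flat.size(1) ** 0.5)
        w_aligned = gain * (w - mean) / (std + 1e-5)
        return F.conv2d(x, w_aligned, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


# Usage: drop-in replacement for nn.Conv2d; works for any batch size,
# including a single sample, since no batch statistics are computed.
if __name__ == "__main__":
    conv = WeightAlignedConv2d(3, 16, kernel_size=3, padding=1)
    y = conv(torch.randn(1, 3, 32, 32))
    print(y.shape)  # torch.Size([1, 16, 32, 32])
```

Because the statistics are computed over the weights rather than over activations, such a layer can also be combined with any activation-normalization method (e.g. Batch Norm or Group Norm) applied downstream.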