Paper Title
Vision Transformers in 2022: An Update on Tiny ImageNet
Paper Authors
Paper Abstract
The recent advances in image transformers have shown impressive results and have largely closed the gap with traditional CNN architectures. The standard procedure is to train on large datasets like ImageNet-21k and then finetune on ImageNet-1k. After finetuning, researchers often consider the transfer learning performance on smaller datasets such as CIFAR-10/100 but have left out Tiny ImageNet. This paper offers an update on vision transformers' performance on Tiny ImageNet. I include Vision Transformer (ViT), Data-Efficient Image Transformer (DeiT), Class-Attention in Image Transformers (CaiT), and Swin Transformers. In addition, the Swin Transformer beats the current state-of-the-art result with a validation accuracy of 91.35%. Code is available here: https://github.com/ehuynh1106/TinyImageNet-Transformers
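The pretrain-then-finetune procedure the abstract describes can be sketched as below. This is a minimal illustration only, not the paper's training recipe (which is in the linked repository): the timm model name, the dataset path, and all hyperparameters here are assumptions for demonstration.

```python
# Minimal fine-tuning sketch: load an ImageNet-pretrained Swin Transformer
# and fine-tune it on Tiny ImageNet. Assumes torch, torchvision, and timm
# are installed and Tiny ImageNet is extracted in ImageFolder layout under
# ./tiny-imagenet-200 (hypothetical path).
import torch
import timm
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tiny ImageNet images are 64x64; upsample to the 224x224 resolution the
# pretrained transformer expects, then normalize with ImageNet statistics.
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder("tiny-imagenet-200/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

# Load pretrained weights and replace the classifier head with a 200-way
# head for Tiny ImageNet's 200 classes. The model variant is illustrative.
model = timm.create_model("swin_large_patch4_window7_224",
                          pretrained=True, num_classes=200).to(device)

# Illustrative optimizer and schedule, not the paper's settings.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=1e-8)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # illustrative epoch count
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.4f}")
```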