ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis

Paper title

ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis

Authors

Xu Cao, Wenqian Ye, Elena Sizikova, Xue Bai, Megan Coffee, Hongwu Zeng, Jianguo Cao

Abstract

Autism spectrum disorder (ASD) is a lifelong neurodevelopmental disorder with very high prevalence around the world. Research progress in the field of ASD facial analysis in pediatric patients has been hindered due to a lack of well-established baselines. In this paper, we propose the use of the Vision Transformer (ViT) for the computational analysis of pediatric ASD. The presented model, known as ViTASD, distills knowledge from large facial expression datasets and offers model structure transferability. Specifically, ViTASD employs a vanilla ViT to extract features from patients' face images and adopts a lightweight decoder with a Gaussian Process layer to enhance the robustness for ASD analysis. Extensive experiments conducted on standard ASD facial analysis benchmarks show that our method outperforms all of the representative approaches in ASD facial analysis, while the ViTASD-L achieves a new state-of-the-art. Our code and pretrained models are available at https://github.com/IrohXu/ViTASD.
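
The pipeline described in the abstract (image patches fed to a ViT encoder, followed by a lightweight head with a Gaussian Process layer) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the names `patchify` and `RandomFeatureGPHead` are hypothetical, the mean-pooling step is only a stand-in for a real transformer encoder, and the random Fourier feature approximation of an RBF-kernel GP is one assumed way to realize a "Gaussian Process layer". The released code at https://github.com/IrohXu/ViTASD is the authoritative reference.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


class RandomFeatureGPHead:
    """Hypothetical GP-style head: random Fourier features approximating an
    RBF kernel, followed by a linear output layer (assumption, not from the paper)."""

    def __init__(self, in_dim, num_features=256, num_classes=2, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection and phase; these are not trained.
        self.W = rng.normal(size=(in_dim, num_features))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
        # Output weights; in a real model these would be learned.
        self.beta = rng.normal(scale=0.01, size=(num_features, num_classes))
        self.scale = np.sqrt(2.0 / num_features)

    def features(self, x):
        # Random Fourier feature map: sqrt(2 / D) * cos(x W + b)
        return self.scale * np.cos(x @ self.W + self.b)

    def logits(self, x):
        return self.features(x) @ self.beta


# Toy usage on a random 224x224 RGB "face image".
rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))
patches = patchify(image, patch_size=16)   # shape (196, 768)
pooled = patches.mean(axis=0)              # stand-in for the ViT encoder output
head = RandomFeatureGPHead(in_dim=pooled.shape[0])
logits = head.logits(pooled)               # shape (2,): ASD vs. non-ASD scores
```

In a real model the pooled vector would come from a pretrained ViT rather than a patch average, and the output weights would be trained on labeled face images.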
