Paper Title

TraDE: Transformers for Density Estimation

Authors

Rasool Fakoor, Pratik Chaudhari, Jonas Mueller, Alexander J. Smola

Abstract

We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data. Our model is trained using a penalized maximum likelihood objective, which ensures that samples from the density estimate resemble the training data distribution. The use of self-attention means that the model need not retain conditional sufficient statistics during the auto-regressive process beyond what is needed for each covariate. On standard tabular and image data benchmarks, TraDE produces significantly better density estimates than existing approaches such as normalizing flow estimators and recurrent auto-regressive models. However, log-likelihood on held-out data only partially reflects how useful these estimates are in real-world applications. To evaluate density estimators systematically, we present a suite of tasks, such as regression using generated samples, out-of-distribution detection, and robustness to noise in the training data, and demonstrate that TraDE works well in these scenarios.
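
For reference, auto-regressive density estimation of the kind the abstract describes rests on the chain rule of probability. The sketch below writes that factorization for a d-dimensional input. The symbols x, d, and θ are notation assumed here for illustration and do not appear in the abstract. The abstract does not state the form of the penalty in the penalized objective, so it is left out.

```latex
% Chain-rule factorization of a joint density over x = (x_1, ..., x_d).
% \theta denotes model parameters (notation assumed for this sketch).
\log p_\theta(x) \;=\; \sum_{i=1}^{d} \log p_\theta\!\left(x_i \mid x_1, \dots, x_{i-1}\right)
```

Under this factorization, the log-likelihood of a held-out point is the sum of its d conditional log-densities, which is the quantity reported in held-out log-likelihood comparisons.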
