Paper Title

On Embeddings for Numerical Features in Tabular Deep Learning

Authors

Yury Gorishniy, Ivan Rubachev, Artem Babenko

Abstract

Recently, Transformer-like deep architectures have shown strong performance on tabular data problems. Unlike traditional models, e.g., MLP, these architectures map scalar values of numerical features to high-dimensional embeddings before mixing them in the main backbone. In this work, we argue that embeddings for numerical features are an underexplored degree of freedom in tabular DL, which allows constructing more powerful DL models and competing with GBDT on some traditionally GBDT-friendly benchmarks. We start by describing two conceptually different approaches to building embedding modules: the first one is based on a piecewise linear encoding of scalar values, and the second one utilizes periodic activations. Then, we empirically demonstrate that these two approaches can lead to significant performance boosts compared to the embeddings based on conventional blocks such as linear layers and ReLU activations. Importantly, we also show that embedding numerical features is beneficial for many backbones, not only for Transformers. Specifically, after proper embeddings, simple MLP-like models can perform on par with the attention-based architectures. Overall, we highlight embeddings for numerical features as an important design aspect with good potential for further improvements in tabular DL.
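The two embedding schemes mentioned in the abstract can be illustrated with a short sketch. The PyTorch snippet below is a minimal illustration, not the authors' reference implementation: `piecewise_linear_encode` assumes per-feature bin edges are precomputed elsewhere (e.g., from quantiles), and `PeriodicEmbedding` uses trainable frequencies initialized from a normal distribution whose scale `sigma` is an assumed hyperparameter name.

```python
# Illustrative sketch (assumptions noted above), not the paper's official code.
import torch
import torch.nn as nn


def piecewise_linear_encode(x, bin_edges):
    """Encode scalars x: (batch,) as (batch, n_bins) piecewise-linear codes.

    bin_edges: sorted tensor of shape (n_bins + 1,) for one feature.
    Each component is 1 past its bin, 0 before it, and a linear ratio inside it,
    so the encoding varies continuously with x.
    """
    left, right = bin_edges[:-1], bin_edges[1:]            # (n_bins,)
    ratio = (x[:, None] - left) / (right - left)           # (batch, n_bins)
    return ratio.clamp(0.0, 1.0)


class PeriodicEmbedding(nn.Module):
    """Map each scalar feature to [sin(2*pi*c*x), cos(2*pi*c*x)] with trainable frequencies c."""

    def __init__(self, n_features, n_frequencies, sigma=1.0):
        super().__init__()
        # Frequencies initialized from N(0, sigma^2); sigma is a tunable scale (assumed name).
        self.frequencies = nn.Parameter(sigma * torch.randn(n_features, n_frequencies))

    def forward(self, x):                                   # x: (batch, n_features)
        angles = 2 * torch.pi * self.frequencies[None] * x[..., None]  # (batch, n_features, n_freq)
        return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)


# Usage sketch: embed the numerical features, then feed the flattened embeddings
# to any backbone (a plain MLP here) instead of the raw scalar values.
if __name__ == "__main__":
    batch, n_features, n_freq = 32, 8, 16
    x = torch.randn(batch, n_features)
    emb = PeriodicEmbedding(n_features, n_freq)
    backbone = nn.Sequential(
        nn.Linear(n_features * 2 * n_freq, 128), nn.ReLU(), nn.Linear(128, 1)
    )
    y_hat = backbone(emb(x).flatten(1))                     # (batch, 1)
```

As in the abstract, the point of the sketch is that the embedding module is independent of the backbone: the same per-feature embeddings can feed an MLP or a Transformer-like model.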
