Factorization Machines with Regularization for Sparse Feature Interactions

Paper Title

Factorization Machines with Regularization for Sparse Feature Interactions

Authors

Kyohei Atarashi, Satoshi Oyama, Masahito Kurihara

Abstract

Factorization machines (FMs) are machine learning predictive models based on second-order feature interactions, and FMs with sparse regularization are called sparse FMs. Such regularization enables feature selection, which selects the most relevant features for accurate prediction, and can therefore improve both the accuracy and the interpretability of the model. However, because FMs use second-order feature interactions, selecting features often causes the loss of many relevant feature interactions in the resulting models. In such cases, FMs with regularization designed specifically for feature interaction selection, which aims at interaction-level sparsity, may be preferable to those designed only for feature selection, which aims at feature-level sparsity. In this paper, we present a new regularization scheme for feature interaction selection in FMs. The proposed regularizer is an upper bound of the $\ell_1$ regularizer for the feature interaction matrix, which is computed from the parameter matrix of FMs. For feature interaction selection, our proposed regularizer makes the feature interaction matrix sparse without the restriction on sparsity patterns imposed by existing methods. We also describe efficient proximal algorithms for the proposed FMs and present theoretical analyses of both the existing and the new regularizers. In addition, we discuss how our ideas can be applied or extended to more accurate feature selection and to other related models such as higher-order FMs and the all-subsets model. The analysis and experimental results on synthetic and real-world datasets show the effectiveness of the proposed methods.
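To make the objects named in the abstract concrete, here is a minimal NumPy sketch of a second-order FM and of the feature interaction matrix computed from its parameter matrix. The abstract does not state the exact form of the proposed regularizer, so this is not the authors' method: the bound shown is only one upper bound on the $\ell_1$ norm of the interaction matrix, obtained from the triangle inequality. The names `fm_predict`, `interaction_l1`, and `l1_upper_bound`, and the symbols `w0`, `w`, and `P`, are assumptions chosen for illustration.

```python
import numpy as np


def fm_predict(x, w0, w, P):
    """Second-order FM output for one sample x of shape (d,).

    P has shape (d, k); the weight of the interaction between
    features i and j (i < j) is the inner product <P[i], P[j]>.
    """
    linear = w0 + w @ x
    xp = x @ P                    # (k,): sum_i x_i * P[i, s]
    x2p2 = (x ** 2) @ (P ** 2)    # (k,): sum_i x_i^2 * P[i, s]^2
    # Standard identity: sum over i < j equals half of
    # (square of the sum) minus (sum of the squares).
    return linear + 0.5 * np.sum(xp ** 2 - x2p2)


def fm_predict_naive(x, w0, w, P):
    """Same output, using the explicit interaction matrix W = P P^T."""
    W = P @ P.T
    total = w0 + w @ x
    d = len(x)
    for i in range(d):
        for j in range(i + 1, d):
            total += W[i, j] * x[i] * x[j]
    return total


def interaction_l1(P):
    """Entrywise l1 norm of the feature interaction matrix P P^T."""
    return np.abs(P @ P.T).sum()


def l1_upper_bound(P):
    """Sum over columns s of ||P[:, s]||_1 squared.

    Illustrative assumption: by the triangle inequality,
    |sum_s P[i, s] P[j, s]| <= sum_s |P[i, s]| |P[j, s]|,
    so this quantity is never smaller than interaction_l1(P).
    """
    return np.sum(np.abs(P).sum(axis=0) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 6, 3
    x = rng.normal(size=d)
    w0, w, P = 0.1, rng.normal(size=d), rng.normal(size=(d, k))
    print(np.isclose(fm_predict(x, w0, w, P), fm_predict_naive(x, w0, w, P)))
    print(interaction_l1(P) <= l1_upper_bound(P))
```

A bound of this kind depends on the columns of `P` rather than its rows, so penalizing it does not zero out whole features at once. That matches the interaction-level (rather than feature-level) sparsity the abstract describes, though the paper itself should be consulted for the actual regularizer and its proximal operator.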

