Title
Spectral Regularization: an Inductive Bias for Sequence Modeling
Authors
Abstract
Various forms of regularization in learning tasks strive for different notions of simplicity. This paper presents a spectral regularization technique, which attaches a unique inductive bias to sequence modeling based on an intuitive concept of simplicity defined in the Chomsky hierarchy. From fundamental connections between Hankel matrices and regular grammars, we propose to use the trace norm of the Hankel matrix, the tightest convex relaxation of its rank, as the spectral regularizer. To cope with the fact that the Hankel matrix is bi-infinite, we propose an unbiased stochastic estimator for its trace norm. Ultimately, we demonstrate experimental results on Tomita grammars, which exhibit the potential benefits of spectral regularization and validate the proposed stochastic estimator.
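To make the Hankel-matrix connection concrete, the following minimal sketch builds a finite block of the Hankel matrix for Tomita grammar 1 (the language 1*) and computes its rank and trace norm with NumPy. The helper names (`all_strings`, `hankel_block`, `tomita1`) and the truncation length `max_len` are illustrative assumptions, not taken from the paper. The block is simply a finite truncation of the bi-infinite matrix; it is not the unbiased stochastic estimator the abstract proposes.

```python
import itertools
import numpy as np

def all_strings(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len, shortest first."""
    return ["".join(t)
            for n in range(max_len + 1)
            for t in itertools.product(alphabet, repeat=n)]

def hankel_block(f, alphabet="01", max_len=3):
    """Finite Hankel block H[p, s] = f(p + s) over prefixes/suffixes up to max_len."""
    words = all_strings(alphabet, max_len)
    return np.array([[f(p + s) for s in words] for p in words], dtype=float)

def tomita1(w):
    """Indicator function of Tomita grammar 1: strings containing only 1s."""
    return 1.0 if "0" not in w else 0.0

H = hankel_block(tomita1)                   # 15 x 15 block (strings of length 0..3)
rank = np.linalg.matrix_rank(H)             # rank of the finite block
trace_norm = np.linalg.norm(H, ord="nuc")   # sum of singular values
print(rank, trace_norm)
```

For this language the membership function factorizes as f(ps) = g(p) g(s), so the block has rank 1 and its trace norm equals the number of indexed strings that lie in 1*. A low trace norm is therefore a convex proxy for the low rank that characterizes simple regular languages.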