Paper Title
Marginalised Gaussian Processes with Nested Sampling
Paper Authors
Paper Abstract
Gaussian process (GP) models are a rich distribution over functions with inductive biases controlled by a kernel function. Learning occurs through the optimisation of kernel hyperparameters using the marginal likelihood as the objective. This classical approach, known as Type-II maximum likelihood (ML-II), yields point estimates of the hyperparameters and continues to be the default method for training GPs. However, this approach risks underestimating predictive uncertainty and is prone to overfitting, especially when there are many hyperparameters. Furthermore, gradient-based optimisation makes ML-II point estimates highly susceptible to the presence of local minima. This work presents an alternative learning procedure where the hyperparameters of the kernel function are marginalised using Nested Sampling (NS), a technique well suited to sampling from complex, multi-modal distributions. We focus on regression tasks with the spectral mixture (SM) class of kernels and find that a principled approach to quantifying model uncertainty leads to substantial gains in predictive performance across a range of synthetic and benchmark data sets. In this context, nested sampling is also found to offer a speed advantage over Hamiltonian Monte Carlo (HMC), widely considered the gold standard in MCMC-based inference.
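To make the ML-II baseline described in the abstract concrete, the following is a minimal NumPy sketch (not the authors' code) of evaluating a GP's log marginal likelihood and selecting hyperparameters that maximise it. For simplicity it uses a squared-exponential kernel rather than the spectral mixture kernel from the paper, and a toy grid search stands in for gradient-based optimisation; the kernel, data, and grid values are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale, variance):
    """Squared-exponential kernel k(x, x') = s^2 exp(-(x - x')^2 / (2 l^2))."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def log_marginal_likelihood(x, y, lengthscale, variance, noise):
    """log p(y | X, theta) for a zero-mean GP with Gaussian observation noise."""
    n = len(x)
    K = rbf_kernel(x, x, lengthscale, variance) + noise * np.eye(n)
    L = np.linalg.cholesky(K)                      # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))           # -0.5 log|K|
            - 0.5 * n * np.log(2 * np.pi))

# Toy data: a noisy sinusoid (illustrative, not from the paper).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
y = np.sin(2.0 * x) + 0.1 * rng.standard_normal(30)

# ML-II: pick the hyperparameters with the highest marginal likelihood.
grid = [(l, 1.0, 0.01) for l in (0.05, 0.2, 0.5, 1.0, 3.0)]
best = max(grid, key=lambda theta: log_marginal_likelihood(x, y, *theta))
print("ML-II lengthscale:", best[0])
```

The point estimate returned by `max` is exactly the kind of single-hypothesis summary the paper argues against: nested sampling would instead weight predictions by the posterior over all of `grid` (and, in practice, over a continuous prior), capturing multi-modal hyperparameter uncertainty that this search discards.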