Title
Bayesian Active Learning for Discrete Latent Variable Models
Authors
Abstract
Active learning seeks to reduce the amount of data required to fit the parameters of a model, thus forming an important class of techniques in modern machine learning. However, past work on active learning has largely overlooked latent variable models, which play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines. Here we address this gap by proposing a novel framework for maximum-mutual-information input selection for discrete latent variable regression models. We first apply our method to a class of models known as "mixtures of linear regressions" (MLR). While it is well known that active learning confers no advantage for linear-Gaussian regression models, we use Fisher information to show analytically that active learning can nevertheless achieve large gains for mixtures of such models, and we validate this improvement using both simulations and real-world data. We then consider a powerful class of temporally structured latent variable models given by a hidden Markov model (HMM) with generalized linear model (GLM) observations, which has recently been used to identify discrete states from animal decision-making data. We show that our method substantially reduces the amount of data needed to fit the GLM-HMM, and outperforms a variety of approximate methods based on variational and amortized inference. Infomax learning for latent variable models thus offers a powerful approach for characterizing temporally structured latent states, with a wide variety of applications in neuroscience and beyond.
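To make the infomax idea concrete, below is a minimal sketch of mutual-information-maximizing input selection for a toy two-component mixture of linear regressions, using a brute-force grid posterior. The model parameters, grid resolutions, and candidate input set are illustrative assumptions, not the paper's actual implementation (which the abstract does not specify).

```python
# A minimal sketch of infomax input selection for a toy two-component mixture of
# linear regressions (MLR), using a brute-force grid posterior. Model parameters,
# grid resolutions, and candidate inputs are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y = w_k * x + eps, with k ~ Categorical(mix) and eps ~ N(0, sigma^2).
sigma = 0.5
mix = np.array([0.5, 0.5])
w_true = np.array([-1.0, 2.0])                        # unknown to the learner

# Discrete grid posterior over the unknown weights (w1, w2).
w_axis = np.linspace(-3.0, 3.0, 41)
W1, W2 = np.meshgrid(w_axis, w_axis, indexing="ij")
theta = np.stack([W1.ravel(), W2.ravel()], axis=1)    # (n_theta, 2)
log_post = np.full(len(theta), -np.log(len(theta)))   # uniform prior

# Grid over y used to approximate the differential entropies numerically.
y_grid = np.linspace(-8.0, 8.0, 401)
dy = y_grid[1] - y_grid[0]

def likelihood(y, x, th):
    """p(y | x, theta) for every theta on the grid; returns shape (n_theta, len(y))."""
    means = th * x                                    # (n_theta, 2): one mean per component
    dens = np.exp(-0.5 * ((y[None, None, :] - means[:, :, None]) / sigma) ** 2)
    dens /= sigma * np.sqrt(2.0 * np.pi)
    return (dens * mix[None, :, None]).sum(axis=1)

def entropy(p):
    """Differential entropy of densities tabulated on y_grid (last axis)."""
    p = np.clip(p, 1e-300, None)
    return -(p * np.log(p)).sum(axis=-1) * dy

def infomax_pick(candidates, log_post):
    """Choose x maximizing I(y; theta | x, data) = H[y | x, data] - E_theta H[y | x, theta]."""
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    best_x, best_mi = None, -np.inf
    for x in candidates:
        lik = likelihood(y_grid, x, theta)            # (n_theta, n_y)
        marginal = post @ lik                         # predictive density p(y | x, data)
        mi = entropy(marginal) - post @ entropy(lik)
        if mi > best_mi:
            best_x, best_mi = x, mi
    return best_x

# Active-learning loop: pick x by infomax, simulate y from the true model, update posterior.
candidates = np.linspace(-2.0, 2.0, 21)
for t in range(30):
    x = infomax_pick(candidates, log_post)
    k = rng.choice(2, p=mix)                          # latent mixture component
    y = w_true[k] * x + sigma * rng.standard_normal()
    log_post += np.log(np.clip(likelihood(np.array([y]), x, theta)[:, 0], 1e-300, None))
    log_post -= log_post.max()

# Mixture components are identified only up to label permutation, so report the MAP estimate.
print("MAP estimate of (w1, w2):", theta[np.argmax(log_post)])
```

This grid-based sketch only illustrates the selection criterion; scaling it to GLM-HMMs or higher-dimensional weights would require the approximate inference strategies discussed in the abstract rather than exhaustive enumeration.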