Paper Title

Modern Multiple Imputation with Functional Data

Authors

Aniruddha Rajendra Rao, Matthew Reimherr

Abstract

This work considers the problem of fitting functional models to sparsely and irregularly sampled functional data. It overcomes limitations of state-of-the-art methods, which face major challenges when fitting more complex non-linear models. Currently, many of these models cannot be consistently estimated unless the number of observed points per curve grows sufficiently quickly with the sample size, whereas we show numerically that a modified approach with more modern multiple imputation methods can produce better estimates in general. We also propose a new imputation approach that combines the ideas of MissForest with Local Linear Forest, and we compare its performance with PACE and several other multivariate multiple imputation methods. This work is motivated by a longitudinal study on smoking cessation, in which the Electronic Health Records (EHR) from Penn State PaTH to Health allow for the collection of a great deal of data, with highly variable sampling. To illustrate our approach, we explore the relation between relapse and diastolic blood pressure. We also consider a variety of simulation schemes with varying levels of sparsity to validate our methods.
