On the power of conditional independence testing under model-X

Paper title

On the power of conditional independence testing under model-X

Authors

Eugene Katsevich, Aaditya Ramdas

Abstract

For testing conditional independence (CI) of a response Y and a predictor X given covariates Z, the recently introduced model-X (MX) framework has been the subject of active methodological research, especially in the context of MX knockoffs and their successful application to genome-wide association studies. In this paper, we study the power of MX CI tests, yielding quantitative insights into the role of machine learning and providing evidence in favor of using likelihood-based statistics in practice. Focusing on the conditional randomization test (CRT), we find that its conditional mode of inference allows us to reformulate it as testing a point null hypothesis involving the conditional distribution of X. The Neyman-Pearson lemma then implies that a likelihood-based statistic yields the most powerful CRT against a point alternative. We also obtain a related optimality result for MX knockoffs. Switching to an asymptotic framework with arbitrarily growing covariate dimension, we derive an expression for the limiting power of the CRT against local semiparametric alternatives in terms of the prediction error of the machine learning algorithm on which its test statistic is based. Finally, we exhibit a resampling-free test with uniform asymptotic Type-I error control under the assumption that only the first two moments of X given Z are known, a significant relaxation of the MX assumption.
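As a rough illustration of the conditional randomization test mentioned in the abstract, the sketch below resamples X from an assumed known conditional distribution given Z and compares the observed statistic with the resampled ones. It is not the authors' code: the function names (`crt_pvalue`, `demo`), the Gaussian linear model for X given Z, the coefficient vector `gamma`, and the simple residual-product statistic are all hypothetical choices made for this example.

```python
import numpy as np


def crt_pvalue(x, y, z, sample_x_given_z, statistic, n_resamples=500, rng=None):
    """Generic CRT p-value: resample X | Z under the null and compare statistics.

    sample_x_given_z(z, rng) must draw from the (assumed known) law of X given Z.
    statistic(x, y, z) returns a scalar; larger values count as evidence against CI.
    """
    rng = np.random.default_rng(rng)
    t_obs = statistic(x, y, z)
    t_null = np.array(
        [statistic(sample_x_given_z(z, rng), y, z) for _ in range(n_resamples)]
    )
    # The +1 terms count the observed statistic among the resamples.
    return (1 + np.sum(t_null >= t_obs)) / (n_resamples + 1)


def demo(seed=0, n=200, effect=0.0):
    """Toy data with X | Z ~ N(Z @ gamma, 1); `effect` is the coefficient of X in Y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 3))
    gamma = np.array([0.5, -0.3, 0.2])  # hypothetical, treated as known (model-X)
    x = z @ gamma + rng.normal(size=n)
    y = z.sum(axis=1) + effect * x + rng.normal(size=n)

    def sampler(z_, rng_):
        # Draw a fresh copy of X from its assumed conditional law given Z.
        return z_ @ gamma + rng_.normal(size=z_.shape[0])

    def stat(x_, y_, z_):
        # Absolute inner product between Y and the part of X not explained by Z.
        return abs(np.dot(x_ - z_ @ gamma, y_))

    return crt_pvalue(x, y, z, sampler, stat, n_resamples=200, rng=rng)


if __name__ == "__main__":
    print("p-value under the null (effect=0):", demo(effect=0.0))
    print("p-value under an alternative (effect=1):", demo(effect=1.0))
```

The statistic here is chosen only for brevity. The abstract's point is that a likelihood-based statistic would be the most powerful choice against a point alternative, and any such statistic can be passed in through the `statistic` argument.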
