Paper Title
Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation
Paper Authors
Abstract
The advent of the big data era has brought new opportunities and challenges for estimating treatment effects from data fusion, that is, from a mixed dataset collected from multiple sources, each with an independent treatment assignment mechanism. Because source labels may be omitted and confounders may be unmeasured, traditional methods cannot estimate individual treatment assignment probabilities or infer treatment effects effectively. We therefore propose to reconstruct the source label and model it as a Group Instrumental Variable (GIV), enabling IV-based regression for treatment effect estimation. In this paper, we conceptualize this line of thought and develop a unified framework (Meta-EM) that (1) maps the raw data into a representation space to construct Linear Mixed Models for the assigned treatment variable; (2) estimates the distribution differences and models the GIV for the different treatment assignment mechanisms; and (3) adopts an alternating training strategy to iteratively optimize the representations and the joint distribution to model the GIV for IV regression. Empirical results demonstrate the advantages of Meta-EM over state-of-the-art methods.
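To make the role of a group instrument concrete, the following is a minimal sketch of IV regression with a group label used as the instrument. It is not the Meta-EM framework: it assumes the source label has already been recovered, and it uses plain two-stage least squares on synthetic data. The data-generating process, the true effect of 2.0, and all names (`simulate`, `two_stage_least_squares`, `ols_slope`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Synthetic data: three sources, each shifting treatment differently (assumed setup)."""
    rng = np.random.default_rng(seed)
    g = rng.integers(0, 3, size=n)           # source label, assumed already reconstructed
    u = rng.normal(size=n)                   # unmeasured confounder
    t = 2.0 * g + u + 0.5 * rng.normal(size=n)   # treatment depends on source and confounder
    y = 2.0 * t + 2.0 * u + rng.normal(size=n)   # outcome; assumed true effect is 2.0
    return y, t, g


def ols_slope(y, t):
    """Naive regression of y on t; biased here because u drives both t and y."""
    X = np.column_stack([np.ones(len(y)), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def two_stage_least_squares(y, t, g):
    """2SLS with group dummies as instruments."""
    n = len(y)
    k = int(g.max()) + 1
    dummies = np.eye(k)[g][:, 1:]            # one-hot encoding, first group as baseline
    Z = np.column_stack([np.ones(n), dummies])
    # Stage 1: project the treatment onto the group instrument.
    gamma, *_ = np.linalg.lstsq(Z, t, rcond=None)
    t_hat = Z @ gamma
    # Stage 2: regress the outcome on the projected treatment.
    X2 = np.column_stack([np.ones(n), t_hat])
    beta, *_ = np.linalg.lstsq(X2, y, rcond=None)
    return beta[1]


y, t, g = simulate()
ols_estimate = ols_slope(y, t)
iv_estimate = two_stage_least_squares(y, t, g)
print(f"OLS estimate: {ols_estimate:.3f}")
print(f"Group-IV estimate: {iv_estimate:.3f}")
```

In this toy setup the group label affects the outcome only through the treatment and is independent of the confounder, so the IV estimate should land near the assumed effect while the naive regression overstates it. When the label must be reconstructed from the data, as the abstract describes, those conditions are no longer guaranteed and would need separate justification.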