Paper title
A probabilistic autoencoder for causal discovery
Paper authors
Abstract
This paper addresses the problem of finding the causal direction between two associated variables. The proposed solution is to build an autoencoder of their joint distribution and to maximize its estimation capacity relative to both marginal distributions. It is shown that the resulting two capacities cannot, in general, be equal. This leads to a new criterion for causal discovery: the higher capacity is consistent with the unconstrained choice of a distribution representing the cause, while the lower capacity reflects the constraints imposed by the mechanism on the distribution of the effect. Estimation capacity is defined as the ability of the autoencoder to represent arbitrary datasets. A regularization term forces the autoencoder to decide which of the two variables to model in a more generic way, i.e., while maintaining higher model capacity. The causal direction is thus revealed by the constraints encountered while encoding the data, rather than measured as a property of the data itself. The idea is implemented and tested using a restricted Boltzmann machine.