Paper title
ProgPermute: Progressive permutation for a dynamic representation of the robustness of microbiome discoveries
Authors
Abstract
Identification of features is a critical task in microbiome studies that is complicated by the fact that microbial data are high dimensional and heterogeneous. Masked by the complexity of the data, the problem of separating signals from noise becomes challenging and troublesome. For instance, when performing differential abundance tests, multiple testing adjustments tend to be overconservative, as the probability of a type I error (false positive) increases dramatically with the large numbers of hypotheses. Moreover, the grouping effect of interest can be obscured by heterogeneity. These factors can incorrectly lead to the conclusion that there are no differences in the microbiome compositions. We translate and represent the problem of identifying differential features as a dynamic layout of separating the signal from its random background. We propose progressive permutation as a method to achieve this process and show converging patterns. More specifically, we progressively permute the grouping factor labels of the microbiome samples and perform multiple differential abundance tests in each scenario. We then compare the signal strength of the top features from the original data with their performance in permutations, and observe an apparent decreasing trend if these top features are true positives identified from the data. We have developed this into a user-friendly RShiny tool and R package, which consist of functions that can convey the overall association between the microbiome and the grouping factor, rank the robustness of the discovered microbes, and list the discoveries, their effect sizes, and individual abundances.
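The progressive permutation idea described in the abstract can be illustrated with a small sketch. This is not the ProgPermute package or its API: the function names (`welch_t`, `permute_k_labels`, `signal_strength`, `progressive_permutation`, `make_toy_data`), the toy data, and the choice of an absolute Welch t-statistic as a stand-in for the differential abundance test are all assumptions made for illustration. The sketch permutes an increasing number of group labels, recomputes the signal strength of the top features found in the original data, and returns the resulting curve, which would be expected to fall toward the random background if the top features are genuine.

```python
import random
import statistics


def welch_t(a, b):
    """Absolute Welch t-statistic (assumed stand-in for a differential abundance test)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    if se == 0:
        return 0.0
    return abs(statistics.mean(a) - statistics.mean(b)) / se


def permute_k_labels(labels, k, rng):
    """Copy the labels and shuffle k randomly chosen positions among themselves."""
    out = list(labels)
    idx = rng.sample(range(len(out)), k)
    vals = [out[i] for i in idx]
    rng.shuffle(vals)
    for i, v in zip(idx, vals):
        out[i] = v
    return out


def signal_strength(abundance, labels, features):
    """Mean |t| over the given feature columns under a given labelling."""
    scores = []
    for f in features:
        a = [row[f] for row, g in zip(abundance, labels) if g == 0]
        b = [row[f] for row, g in zip(abundance, labels) if g == 1]
        scores.append(welch_t(a, b))
    return statistics.mean(scores)


def progressive_permutation(abundance, labels, n_top=1, steps=4, n_rep=20, seed=0):
    """Return the top features and a list of (labels permuted, mean signal strength)."""
    rng = random.Random(seed)
    n = len(labels)
    # Rank features by their signal under the original, unpermuted labels.
    ranked = sorted(
        range(len(abundance[0])),
        key=lambda f: signal_strength(abundance, labels, [f]),
        reverse=True,
    )
    top = ranked[:n_top]
    curve = []
    for s in range(steps + 1):
        k = round(n * s / steps)  # number of labels permuted at this step
        reps = [
            signal_strength(abundance, permute_k_labels(labels, k, rng), top)
            for _ in range(n_rep)
        ]
        curve.append((k, statistics.mean(reps)))
    return top, curve


def make_toy_data(seed=1, n_per_group=10, n_features=5, shift=5.0):
    """Hypothetical data: feature 0 differs between groups, the rest are noise."""
    rng = random.Random(seed)
    labels = [0] * n_per_group + [1] * n_per_group
    abundance = []
    for g in labels:
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += shift * g
        abundance.append(row)
    return abundance, labels


if __name__ == "__main__":
    data, groups = make_toy_data()
    top, curve = progressive_permutation(data, groups)
    print("top features:", top)
    for k, strength in curve:
        print(f"{k:2d} labels permuted -> mean |t| = {strength:.2f}")
```

Shuffling the selected labels among themselves keeps the group sizes fixed, so each step differs from the original data only in how much of the grouping has been scrambled. A real analysis would likely replace the t-statistic with the tests and multiple testing adjustments the authors use.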