Paper Title

A Bayesian Nonparametric Approach for Identifying Differentially Abundant Taxa in Multigroup Microbiome Data with Covariates

Authors

Archie Sachdeva, Somnath Datta, Subharup Guha

Abstract

Scientific studies in the last two decades have established the central role of the microbiome in disease and health. Differential abundance analysis seeks to identify microbial taxa associated with sample groups defined by a factor such as disease subtype, geographical region, or environmental condition. The results, in turn, help clinical practitioners and researchers diagnose disease and develop treatments more effectively. However, microbiome data analysis is uniquely challenging due to high dimensionality, sparsity, compositionality, and collinearity. There is a critical need for unified statistical approaches to differential analysis in the presence of covariates. We develop a zero-inflated Bayesian nonparametric (ZIBNP) methodology that meets these multipronged challenges. The proposed technique flexibly adapts to the unique data characteristics, casts the high proportion of zeros in a censoring framework, and mitigates high dimensionality and collinearity by utilizing the dimension-reducing property of the semiparametric Chinese restaurant process. Additionally, the ZIBNP approach relates the microbiome sampling depths to inferential precision while accommodating the compositional nature of microbiome data. Through simulation studies and analyses of the CAnine Microbiome during Parasitism (CAMP) and Global Gut microbiome datasets, we demonstrate the accuracy of ZIBNP relative to established methods for differential abundance analysis in the presence of covariates.
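
The dimension-reducing property mentioned in the abstract can be illustrated with a minimal sketch of the standard Chinese restaurant process. This is not the authors' semiparametric variant or the ZIBNP model itself; the function name `sample_crp`, the concentration value `alpha=2.0`, and the item count are assumptions chosen only for illustration. The sketch shows that a large number of items typically collapses into far fewer clusters, which is the sense in which the process reduces dimension.

```python
import random
from collections import Counter


def sample_crp(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a standard Chinese restaurant process.

    alpha is the concentration parameter (an illustrative value, not taken
    from the paper). Larger alpha tends to produce more clusters.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] is the cluster index of item i
    sizes = []   # sizes[k] is the current number of items in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels


labels = sample_crp(n_items=500, alpha=2.0)
n_clusters = len(set(labels))
print("items:", len(labels), "clusters:", n_clusters)
print("largest clusters:", Counter(labels).most_common(3))
```

Under this prior the expected number of clusters grows only logarithmically with the number of items, so downstream inference can work with cluster-level parameters rather than one parameter per item.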
