Depth Normalization of Small RNA Sequencing: Using Data and Biology to Select a Suitable Method

Paper Title

Depth Normalization of Small RNA Sequencing: Using Data and Biology to Select a Suitable Method

Authors

Yannick Düren, Johannes Lederer, Li-Xuan Qin

Abstract

Deep sequencing has become one of the most popular tools for transcriptome profiling in biomedical studies. While an abundance of computational methods exists for "normalizing" sequencing data to remove unwanted between-sample variations due to experimental handling, there is no consensus on which normalization method is the most suitable for a given data set. To address this problem, we developed "DANA", an approach for assessing the performance of normalization methods for microRNA sequencing data based on biology-motivated and data-driven metrics. Our approach takes advantage of well-known biological features of microRNAs, namely their expression patterns and chromosomal clustering, to simultaneously assess (1) how effectively normalization removes handling artifacts and (2) how aptly normalization preserves biological signals. With DANA, we confirm that the performance of eight commonly used normalization methods varies widely across different data sets, and we provide guidance for selecting a suitable method for the data at hand. Hence, DANA should be adopted as a routine preprocessing step (preceding normalization) for microRNA sequencing data analysis. DANA is implemented in R and publicly available at https://github.com/LXQin/DANA.
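
For readers unfamiliar with the term, the following is a minimal sketch of what "depth normalization" means in practice, using total-count scaling (counts per million) on a toy matrix. It is not DANA and does not use DANA's R interface; the abstract does not name the eight methods compared, so treating total-count scaling as a representative is an assumption. The function name total_count_normalize and the toy counts are hypothetical and exist only for illustration.

```python
def total_count_normalize(counts, scale=1_000_000):
    """Rescale each sample (column) so that its counts sum to `scale`.

    `counts` is a list of rows, one row per microRNA, one column per sample.
    This is plain total-count (counts-per-million) scaling, shown only as an
    illustration; it is not the DANA assessment procedure.
    """
    n_samples = len(counts[0])
    # Sequencing depth of each sample = total reads in that column.
    depths = [sum(row[j] for row in counts) for j in range(n_samples)]
    if any(depth == 0 for depth in depths):
        raise ValueError("every sample needs at least one read")
    return [
        [row[j] * scale / depths[j] for j in range(n_samples)]
        for row in counts
    ]


# Toy data (hypothetical): 3 microRNAs (rows) x 2 samples (columns).
# Sample 2 was sequenced twice as deeply as sample 1, with identical
# relative abundances.
toy_counts = [
    [10, 20],
    [30, 60],
    [60, 120],
]

normalized = total_count_normalize(toy_counts)
```

After scaling, the two samples have identical values for every microRNA, because they differed only in depth. Real handling effects are rarely this simple, which is why the choice among normalization methods matters and why an assessment tool such as DANA is proposed.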

Paper Title