How to normalise bulk RNA-seq data to account for transcript length, sequencing depth, library size and batch effects?
10 weeks ago
DJHS • 0

I'm currently working on a project where I'm comparing aggregate measurements (mean, median, etc.) of expression data (RNA-seq) from different groups of genes across various samples with different characteristics (tissue type, health status, etc.). Additionally, the raw counts were collected from several different labs using various techniques.

Because I am comparing expression between genes, the data should be normalised to account for differences in transcript length and sequencing depth (TPM, RPKM, FPKM). However, I am also interested in comparisons across samples based on tissue type and other factors. Therefore, the data should also be normalised to account for library size (TMM, quantile, etc.), and, as the data were collected from multiple sources, they should be corrected for batch effects.
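
To make the within-sample step concrete, here is a minimal sketch of a TPM calculation for a single sample. The function name `counts_to_tpm`, the toy counts and the gene lengths are all made up for illustration; a real pipeline would more likely take effective lengths from a quantifier's output rather than hard-coded values.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw counts for ONE sample to TPM (transcripts per million).

    counts     -- raw read counts, one per gene (hypothetical input)
    lengths_bp -- gene/transcript lengths in base pairs, same order
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")

    # Step 1: correct for transcript length (reads per kilobase).
    rpk = [c / (l / 1000.0) for c, l in zip(counts, lengths_bp)]

    # Step 2: correct for sequencing depth by rescaling so that the
    # length-corrected values sum to one million within the sample.
    total_rpk = sum(rpk)
    if total_rpk == 0:
        return [0.0] * len(counts)
    return [r / total_rpk * 1e6 for r in rpk]


# Toy values, assumed for illustration only.
toy_counts = [100, 200, 300]
toy_lengths = [1000, 2000, 3000]
tpm = counts_to_tpm(toy_counts, toy_lengths)
```

In this toy case the three genes have counts proportional to their lengths, so they end up with equal TPM values. Note that this only covers the within-sample part; it does nothing about between-sample composition differences or batch effects.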

I have read through many papers but am unsure and confused about how to proceed with the normalisation procedure starting with the raw counts. Can I simply string the methods together, starting with batch effect correction, followed by library size normalisation, and then the within-sample normalisations?

I would appreciate any insights or suggestions on this. Thanks.

RNA-seq Normalisation • 1.1k views

Have you read the DESeq2 (LINK) and edgeR (LINK) vignettes and the original papers?
