When do we need to normalize for GC-content and/or length bias in RNA-Seq reads?
16 months ago
pilargmarch ▴ 110

Hi! This has been a conundrum for me these past months. There are some packages like cqn (conditional quantile normalization) and EDASeq that can be used to normalize for sample-specific gene GC content and/or length biases, which can alter functional enrichment analysis results.

My question is: when is it appropriate to use these normalization techniques? I have some GSEA results that change drastically after normalizing with cqn, going from 17 to 109 significant GO terms, but I'm not sure whether applying the normalization is correct.

Thanks for reading :)

gc-content normalization RNA-seq bias edaseq • 551 views
16 months ago

To be honest, I don't think anyone knows what is right and what is wrong - it is a bit of a wild west out there, everyone swinging.

I would plot the distribution of the p-values and generate heatmaps and PCA plots, both before and after the correction, to try to understand whether the process improved the data or introduced unwanted artifacts.
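
As a rough illustration of the p-value check, here is a minimal sketch in plain Python that bins p-values so the two distributions can be compared side by side. The function names and the p-values are made up for this example; in practice the inputs would be the per-gene p-values from your own DE runs with and without cqn.

```python
def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # p == 1.0 would index past the end, so clamp it into the last bin.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


def show(label, counts):
    """Print a crude text histogram, one row per bin."""
    n_bins = len(counts)
    print(label)
    for i, c in enumerate(counts):
        print(f"  [{i / n_bins:.1f}, {(i + 1) / n_bins:.1f}) {'#' * c} {c}")


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    raw = [0.01, 0.02, 0.04, 0.55, 0.95, 1.0]
    cqn = [0.001, 0.03, 0.2, 0.45, 0.7, 0.9]
    show("before cqn", pvalue_histogram(raw))
    show("after cqn", pvalue_histogram(cqn))
```

A histogram that is roughly flat with a peak near zero is usually taken as well behaved; a hump in the middle or a pile-up near 1 in either version would be a reason to look more closely.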

Try to explain the changes in terms of what happens to the individual genes and to the error distribution, not in terms of the GO term enrichment. (I will admit that I am not sure whether these corrections are applied before or after the DE detection runs.)
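
One possible gene-level check, sketched here in plain Python, is to ask how strongly the per-gene log fold change tracks GC content before and after the correction. Everything in this sketch is an assumption for illustration: the function names are hypothetical and the inputs would be per-gene GC fractions and log2 fold changes exported from your own analysis.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Made-up per-gene values, for illustration only.
    gc = [0.35, 0.42, 0.48, 0.55, 0.61]
    lfc_raw = [-1.2, -0.4, 0.1, 0.6, 1.5]
    lfc_cqn = [0.3, -0.5, 0.2, -0.1, 0.4]
    print("raw:", spearman(gc, lfc_raw))
    print("cqn:", spearman(gc, lfc_cqn))
```

A clear correlation that shrinks after the correction would be consistent with a GC-related effect being removed, although it does not by itself show that the corrected results are the right ones.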
