I am doing a differential binding (DB) analysis of ChIP-seq data using the csaw package. My data appear to have a small trended bias and possibly a GC content bias. Below are MA plots showing signal in the merged peaks using CPM normalisation (LHS) and loess normalisation (RHS), the latter to account for trended biases:
If I create MA plots coloured by GC content (showing only the top and bottom 10% of peaks by GC content), there also appears to be a GC content bias (LHS), which can be fixed using CQN normalisation (RHS):
However, looking at the effect of CQN normalisation (RHS) on all of the data, I'm not sure that it corrects the trended bias properly, as the loess normalisation does.
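For reference, the M and A values behind MA plots like these can be computed roughly as follows. This is a simplified Python sketch, not the edgeR or csaw implementation: the function names, the prior count of 0.5 and the example counts are all assumptions made for illustration.

```python
import math


def log2_cpm(count, lib_size, prior=0.5):
    """Log2 counts-per-million, with a small prior count to avoid log2(0)."""
    return math.log2((count + prior) / lib_size * 1e6)


def ma_values(counts_a, counts_b, lib_a, lib_b):
    """Return one (A, M) pair per region.

    A is the average log2-CPM across the two libraries,
    M is the log2 fold change between them.
    """
    pairs = []
    for a, b in zip(counts_a, counts_b):
        la = log2_cpm(a, lib_a)
        lb = log2_cpm(b, lib_b)
        pairs.append(((la + lb) / 2, la - lb))
    return pairs


# Hypothetical counts for three merged peaks in two libraries of equal size.
peaks = ma_values([100, 20, 400], [50, 20, 100], lib_a=1_000_000, lib_b=1_000_000)
```

A trended bias shows up as the cloud of M values drifting away from zero as A changes; a GC bias shows up as the high-GC and low-GC peaks sitting at systematically different M values.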
Also, when I look at the differential peak regions called with the CQN normalisation in a genome browser, some of the calls don't match what I can roughly judge by eye, suggesting the normalisation isn't appropriate. Both methods output an offset matrix, which I supply to edgeR.
- Is there a way to combine the offsets produced by cqn and csaw to correct for both the trended bias and the GC bias?
- Is there a better way to correct for these biases?
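
To make the first question concrete, a naive combination could look like the toy sketch below. It is plain Python on made-up numbers, not the cqn or csaw API, and it assumes both offset matrices are regions x samples on the same log scale. The helper names and the idea of centring the GC offsets per sample before adding them are assumptions for illustration only; whether this is statistically valid, given that the two corrections are estimated separately and may partly absorb each other's effect, is exactly what is being asked.

```python
def column_means(matrix):
    """Mean of each column (sample) of a regions x samples matrix."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]


def combine_offsets(trend, gc):
    """Add the within-sample part of `gc` to `trend`, element-wise.

    Each sample's mean is subtracted from `gc` first, so that a
    per-sample shift (for example a library-size term) already present
    in `trend` is not counted a second time.
    """
    gc_means = column_means(gc)
    return [
        [t + (g - m) for t, g, m in zip(t_row, g_row, gc_means)]
        for t_row, g_row in zip(trend, gc)
    ]


# Hypothetical offsets for 3 regions x 2 samples.
trend_offsets = [[13.8, 14.1], [13.9, 14.0], [14.0, 13.9]]
gc_offsets = [[13.7, 14.3], [13.9, 14.0], [14.1, 13.7]]
combined = combine_offsets(trend_offsets, gc_offsets)
```

By construction the combined matrix keeps the per-sample means of the trend offsets and only adds the region-specific GC deviations on top.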