How quantitative is ChIP-seq? If it isn't, why do we need normalisation to per million mapped reads? Can we compare signal across samples normalised to per million mapped reads?
I am currently analysing ChIP-seq datasets, and I've been confused by the differing attitudes I have come across regarding exactly how quantitative ChIP-seq is. As far as I am aware, it is qualitative rather than quantitative: ChIP-seq can tell you that a peak is present in one tissue and absent in another, but not how much more of your protein is present in one tissue than in the other. For that you need a spike-in approach such as ChIP-Rx.
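(For concreteness, my understanding of what ChIP-Rx adds is a per-sample scale factor derived from an exogenous spike-in rather than from total library size. A minimal sketch of that idea, with entirely hypothetical read counts:)

```python
# Minimal sketch of spike-in (ChIP-Rx-style) scaling, as I understand it.
# Read counts below are hypothetical; in practice they come from aligning
# each sample to a combined endogenous + spike-in (e.g. Drosophila) genome.
samples = {
    "stem":           {"endogenous_reads": 28_000_000, "spikein_reads": 1_400_000},
    "differentiated": {"endogenous_reads": 35_000_000, "spikein_reads":   700_000},
}

for name, counts in samples.items():
    # The scale factor comes from the spike-in, NOT the total library size,
    # so a genuine global shift in the endogenous signal is preserved
    # instead of being normalised away.
    scale = 1e6 / counts["spikein_reads"]
    print(f"{name}: multiply per-bin coverage by {scale:.3f}")
```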
If this is the case, then why, in the Roadmap Epigenomics paper 'Integrative analysis of 111 reference human epigenomes', do they say:
"To avoid artificial differences in signal strength due to differences in sequencing depth, all consolidated histone mark data sets ... were uniformly subsampled to a maximum depth of 30 million reads (the median read depth over all consolidated samples). "
I understand why you might do this for replicates, but that is not what they are describing here. I have seen only a few other papers normalise read depth like this; I had always assumed it was unnecessary because you shouldn't be comparing samples quantitatively anyway.
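For what it's worth, my reading of "uniformly subsampled" is simply random downsampling of each library to a common depth, something like the sketch below (the 30 million cap comes from the quote above; everything else is hypothetical):

```python
import random

def subsample_reads(reads, max_depth=30_000_000, seed=42):
    """Randomly downsample a library to at most max_depth reads,
    so every sample is compared at the same sequencing depth."""
    if len(reads) <= max_depth:
        return list(reads)
    rng = random.Random(seed)  # fixed seed for reproducibility
    return rng.sample(reads, max_depth)
```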
Why do so many programs offer the option to normalise signal to per million mapped reads if we can't compare quantitatively across samples anyway?
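My working mental model of what those programs compute is something like this (a minimal sketch; the bin counts are made up):

```python
import numpy as np

def cpm_normalise(bin_counts, total_mapped_reads):
    """Scale raw per-bin read counts to 'signal per million mapped reads'.
    This removes differences in sequencing depth, but any global change
    in ChIP efficiency or overall mark abundance is scaled away with it."""
    return np.asarray(bin_counts) / (total_mapped_reads / 1e6)

# Hypothetical counts over the same genomic bins in two samples
stem = cpm_normalise([120, 45, 300], total_mapped_reads=28_000_000)
diff = cpm_normalise([240, 90, 600], total_mapped_reads=56_000_000)
print(stem, diff)  # identical after CPM, even though raw coverage doubled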
Let's say I have ChIP-seq'd H3K27me3 in stem cells and differentiated cells. I assume I am not simply allowed to take the signal per million mapped reads and "subtract" one from the other to see how much more or less of the mark is present in the differentiated cells (sketched below)?
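By "subtract" I mean something like the following, with hypothetical per-bin CPM tracks; my suspicion is that this delta is only meaningful if the total amount of H3K27me3 per cell is actually comparable between the two cell types:

```python
import numpy as np

# Hypothetical per-bin CPM tracks over the same genomic bins
stem_cpm = np.array([4.3, 1.6, 10.7])
diff_cpm = np.array([2.1, 1.6, 21.4])

# The naive comparison I am asking about: a per-bin difference of
# depth-normalised signal. If global H3K27me3 levels differ between
# the cell types, CPM forces both tracks onto the same overall scale
# and this delta becomes misleading.
delta = diff_cpm - stem_cpm
print(delta)
```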
Any thoughts would be appreciated, thanks!