General quantification of signal enrichment in ChIPseq experiments
4.6 years ago
Roman Hillje ▴ 40

Hey everybody,

I'm currently analysing ChIPseq data from paraffin-embedded samples. The reads we get from sequencing are much more dispersed throughout the genome than those from fresh tissue (using the same antibody). By dispersed I mean:

1) We get reads in regions in which there is no signal in the fresh tissue.

2) In the places where we do see enrichment in the paraffin-embedded samples, the peak reaches a maximum pileup of ~20 reads, whereas the fresh tissue gives us ~150 reads.

I would like to assign a number to this. Obviously it depends on what you're measuring (histone marks, TF, etc.), but as a comparison between fresh and paraffin-embedded samples this could be a useful number to improve the ChIPseq protocol.

Does anybody know of a metric like this? My first idea is simply the distribution of coverage, which would show whether there are regions with high coverage or not. But maybe there is a more sophisticated solution?
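To make the coverage-distribution idea concrete, here is a minimal sketch. It assumes the reads have already been counted in fixed-size genomic bins; the function names and the two count lists are made up for illustration and are not real data.

```python
from collections import Counter


def coverage_distribution(bin_counts):
    """Return {coverage value: fraction of bins with that coverage}."""
    if not bin_counts:
        raise ValueError("bin_counts must not be empty")
    total = len(bin_counts)
    hist = Counter(bin_counts)
    return {cov: n / total for cov, n in sorted(hist.items())}


def fraction_of_bins_at_least(bin_counts, threshold):
    """Fraction of bins whose read count is >= threshold."""
    if not bin_counts:
        raise ValueError("bin_counts must not be empty")
    return sum(1 for c in bin_counts if c >= threshold) / len(bin_counts)


# Hypothetical per-bin read counts (illustrative only).
fresh = [0, 0, 0, 1, 0, 150, 120, 0, 1, 0]   # sharp peaks, empty background
ffpe = [3, 2, 4, 3, 2, 20, 15, 3, 2, 4]      # low peaks, diffuse background

print(coverage_distribution(fresh))
print(fraction_of_bins_at_least(fresh, 100))
print(fraction_of_bins_at_least(ffpe, 100))
```

In a sample with strong enrichment, most bins sit near zero and a few bins carry very high counts; in a dispersed sample the distribution is narrower and shifted away from zero.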

Thanks!

ChIP-Seq
4.6 years ago

FRiP (fraction of reads in peaks) would seem useful, though you might also use plotFingerprint and have it output the QC metrics so you can get the Jensen-Shannon distance (preferably vs. the control). Either of those numbers would likely be useful for you.
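For reference, FRiP is just the number of reads overlapping a called peak divided by the total number of reads. A rough sketch, assuming reads and peaks are available as (chrom, start, end) tuples with half-open coordinates; the function name and example intervals are hypothetical, and real data would normally go through a proper interval tool rather than this quadratic loop:

```python
def frip(reads, peaks):
    """Fraction of reads overlapping at least one peak.

    reads, peaks: lists of (chrom, start, end), half-open intervals.
    """
    if not reads:
        raise ValueError("reads must not be empty")

    # Group peaks by chromosome so each read is only compared
    # against peaks on its own chromosome.
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))

    in_peaks = 0
    for chrom, start, end in reads:
        candidates = peaks_by_chrom.get(chrom, [])
        if any(start < p_end and p_start < end for p_start, p_end in candidates):
            in_peaks += 1
    return in_peaks / len(reads)


# Hypothetical intervals (illustrative only).
example_reads = [
    ("chr1", 100, 150),
    ("chr1", 500, 550),
    ("chr1", 1000, 1050),
    ("chr2", 100, 150),
]
example_peaks = [("chr1", 120, 600)]

print(frip(example_reads, example_peaks))
```

Because the value depends on the peak set, comparing FRiP between samples is most meaningful when both are scored against the same peaks.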

I have used FRiP to estimate this problem before, but I would like to avoid the peak-calling step because I feel it should be possible to do this without prior filtering of the signal. plotFingerprint is a great idea, thank you! I'll also have a look at the Jensen-Shannon distance, though I'm not very familiar with it yet.
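For anyone else unfamiliar with it: the Jensen-Shannon distance is the square root of the Jensen-Shannon divergence, a symmetric, bounded measure of how different two probability distributions are. With base-2 logarithms it ranges from 0 (identical) to 1 (no overlap). The sketch below only illustrates the general definition on two histograms over the same bins; it is not necessarily how plotFingerprint computes its value internally, and the function names and example histograms are made up.

```python
import math


def normalise(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (base 2) between distributions p and q."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i == 0 contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


# Hypothetical histograms of per-bin coverage (illustrative only):
# number of bins with coverage 0, 1-5, 6-50 and >50 reads.
chip_hist = normalise([700, 200, 60, 40])
control_hist = normalise([300, 650, 50, 0])

print(jensen_shannon_distance(chip_hist, control_hist))
```

A ChIP sample whose coverage distribution is close to that of its control gives a small distance, which would be consistent with weak enrichment.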