Question: ChIP-seq Input replicate correlation
EagleEye (3.8k), Sweden, wrote 4 days ago:

Hello everyone,

I have ChIP-seq samples with replicates for the same experiment. I processed them as follows.

Steps done:

  • Bowtie and Bowtie2 (50 bp read length; the output of both aligners was used for further processing)
  • Picard (duplicate removal)
  • NGSUtils (bamutils; cleaned with the blacklist and MAPQ 30)
  • deepTools (fingerprint, BAM summary and correlation)

Query:

I am planning to call peaks with MACS2 using these replicate samples (2 x treatment and 2 x Input). However, the Pearson correlation between the Input replicates of S1 is only 0.61 (highlighted with a box), whereas for the treatment replicates it is above 0.85 [refer to the attached figure below].

Is it fine to treat S1_InputR1 and S1_InputR2 as replicates when calling peaks with MACS2? [refer to the attached figure below]

Any suggestions would be appreciated.

[Figure: Pearson correlation between the samples, with the S1 Input replicate pair highlighted by a box]

chip-seq
modified 4 days ago by Ian (4.8k) • written 4 days ago by EagleEye (3.8k)

What does the Spearman's correlation look like?

written 4 days ago by Devon Ryan (65k)

With Spearman correlation, the value dropped to 0.44 for the Input samples (S1_InputR1 vs S1_InputR2), whereas the treatment samples were essentially unchanged (0.84).

written 4 days ago by EagleEye (3.8k)
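
A note on why the two coefficients can disagree: the toy sketch below is only an illustration under stated assumptions, not deepTools code. The bin counts are invented, and `pearson`, `ranks` and `spearman` are hypothetical helper names. It shows how a couple of very high-coverage bins shared by two samples can push Pearson close to 1 while the rank agreement over the remaining bins, which is what Spearman measures, stays low.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented per-bin read counts for two hypothetical input samples.
# The first eight bins are unrelated background; the last two bins
# are high in both samples (e.g. an over-represented region).
input_r1 = [5, 7, 6, 8, 5, 9, 6, 7, 120, 110]
input_r2 = [8, 5, 7, 5, 9, 6, 8, 6, 115, 125]

r_pearson = pearson(input_r1, input_r2)
r_spearman = spearman(input_r1, input_r2)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

In this made-up case the two extreme bins dominate Pearson, so comparing both coefficients gives a rough idea of whether a correlation is carried by a handful of bins or by the bulk of the genome.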

Interesting, do the fingerprints also look different?

written 4 days ago by Devon Ryan (65k)

Here is the fingerprint for the above samples:

[Figure: fingerprint plot for the above samples]

The Inputs prepared during two different experiments look much closer to each other (S1_InputR1 and S2_InputR1) than the Inputs from the same experiment (S1_InputR1 and S1_InputR2).

modified 4 days ago • written 4 days ago by EagleEye (3.8k)

Interesting, S1_InputR2 also covers ~5% less of the genome. From the graph, its depth looks a bit lower, but I'm guessing the difference isn't huge. If you have a recent-ish version of deepTools, it'd be interesting to run plotFingerprint with --outQualityMetrics something.txt --JSDsample S1_InputR1.bam to see what the "synthetic JSD" is. This is the Jensen-Shannon divergence between a sample's coverage distribution and what would be seen from an ideal input sample sequenced to the same depth. Lower is better, and I suspect your S1_InputR2 will have a notably higher value, which would indicate that something went wrong at some point (possibly too many PCR cycles?) and that the sample should maybe be excluded.

written 4 days ago by Devon Ryan (65k)
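
To make the "synthetic JSD" idea above more concrete, here is a minimal sketch of a Jensen-Shannon divergence between a sample's per-bin coverage distribution and a Poisson distribution with the same mean, standing in for an ideal input. This is an assumption-based illustration, not how deepTools computes its metric (its binning, scaling and reported value may differ). The function names and the per-bin depths are hypothetical.

```python
from math import exp, factorial, log2


def coverage_histogram(depths, max_depth):
    """Fraction of bins at each depth 0..max_depth (deeper bins are capped)."""
    counts = [0] * (max_depth + 1)
    for depth in depths:
        counts[min(depth, max_depth)] += 1
    return [c / len(depths) for c in counts]


def poisson_pmf(mean, max_depth):
    """Poisson probabilities for 0..max_depth-1, with the tail in the last slot."""
    pmf = [exp(-mean) * mean ** k / factorial(k) for k in range(max_depth)]
    pmf.append(max(0.0, 1.0 - sum(pmf)))
    return pmf


def jsd(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 whenever a > 0.
        return sum(x * log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def synthetic_jsd(depths, max_depth=15):
    """JSD between observed coverage and a Poisson with the same mean depth."""
    mean_depth = sum(depths) / len(depths)
    observed = coverage_histogram(depths, max_depth)
    ideal = poisson_pmf(mean_depth, max_depth)
    return jsd(observed, ideal)


# Invented per-bin depths: one roughly Poisson-like sample, and one where
# most bins are empty and a few are very deep (similar mean depth).
even_input = [0] * 14 + [1] * 27 + [2] * 27 + [3] * 18 + [4] * 9 + [5] * 4 + [6]
skewed_input = [0] * 80 + [10] * 20

jsd_even = synthetic_jsd(even_input)
jsd_skewed = synthetic_jsd(skewed_input)
print(f"even input:   {jsd_even:.4f}")
print(f"skewed input: {jsd_skewed:.4f}")
```

The evenly covered toy sample stays close to its Poisson reference, while the sample with reads piled into a few bins diverges strongly, which is the kind of difference the metric is meant to flag.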
Ian (4.8k), University of Manchester, UK, wrote 4 days ago:

Well, they are by definition your replicates, even if they don't look particularly similar. The difference between the two inputs might be down to coverage, unless they are already normalised.

You could run S1_input (as a pseudo-ChIP sample) against S2_input (as the input), and vice versa, in MACS2, which would show you whether you would get false-positive peaks from the inputs alone.

Just a thought.

written 4 days ago by Ian (4.8k)
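
The input-versus-input check suggested above could be set up roughly as follows. This is a sketch under assumptions: the BAM file names, the output directory, the run names and the genome size shortcut `hs` (human) are placeholders, since the thread does not state them. The code only builds and prints the `macs2 callpeak` commands, it does not run them.

```python
import shlex


def macs2_callpeak(treatment, control, name, genome="hs",
                   outdir="macs2_input_check"):
    """Build a macs2 callpeak command as an argument list.

    genome="hs" and outdir are assumptions; adjust them to the real data.
    """
    return [
        "macs2", "callpeak",
        "-t", treatment,   # sample treated as the (pseudo) ChIP
        "-c", control,     # sample used as the control
        "-f", "BAM",
        "-g", genome,
        "-n", name,
        "--outdir", outdir,
    ]


def reciprocal_input_checks(input_a, input_b):
    """Each input is used once as pseudo-ChIP and once as control."""
    return [
        macs2_callpeak(input_a, input_b, "inputA_vs_inputB"),
        macs2_callpeak(input_b, input_a, "inputB_vs_inputA"),
    ]


# Placeholder file names based on the sample labels used in the thread.
commands = reciprocal_input_checks("S1_InputR1.bam", "S2_InputR1.bam")
for command in commands:
    # Print a shell-ready line; pass the list to subprocess.run to execute.
    print(shlex.join(command))
```

If either run reports many peaks, the inputs differ enough to produce false positives on their own, which would be worth knowing before using them as controls.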