Question

the peaks of ChIPseq are few

5.1 years ago

Francis ▴ 20

Hi,

I would like to study the H3K27ac modification across several stages. I have the BAM files below; the numbers are read counts.

113,664,782   A.bam
 87,305,342   B.bam 
 83,029,416   C.bam
 49,539,212   D.bam

I used MACS2 to call peaks:

macs2 callpeak -t $input -q 0.05 -f BAM -g hs -B --outdir $output -n $sample

and I got the following numbers of peaks:

9065 A_peaks.narrowPeak
4873 B_peaks.narrowPeak
11080 C_peaks.narrowPeak
3521 D_peaks.narrowPeak

So the question is: why are there so few peaks? Is the sequencing depth insufficient, or is there another reason?

Thanks.

ChIP-Seq next-gen call peak • 2.5k views

ADD COMMENT • link updated 5.1 years ago by Friederike 8.9k • written 5.1 years ago by Francis ▴ 20

score 1 · Answer 1 · 2019-03-09

5.1 years ago

GouthamAtla 12k

Are those read counts after thorough filtering (multimapped reads, duplicates, blacklisted regions, etc.)?

The number of peaks depends less on the number of reads than on the quality of the library prep. If you have a lot of background, you may end up with very few peaks. Create bigWig files (for example with bamCoverage) and look at them in the UCSC Genome Browser or IGV to get an idea of how good the data is.
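As a rough sketch of the bigWig step, the snippet below only builds and prints one bamCoverage command per BAM file; it does not run anything. The output directory `bigwig`, the bin size of 10, and CPM normalization are assumptions chosen for illustration (CPM because the four libraries differ in depth), and `--normalizeUsing` assumes deepTools 3 or later.

```python
import shlex
from pathlib import Path


def bamcoverage_command(bam, out_dir="bigwig", bin_size=10):
    """Build (but do not run) a bamCoverage call for one BAM file.

    out_dir, bin_size and CPM normalization are illustrative assumptions,
    not settings taken from the thread.
    """
    out = Path(out_dir) / (Path(bam).stem + ".bw")
    return [
        "bamCoverage",
        "-b", str(bam),             # input BAM (must be sorted and indexed)
        "-o", str(out),             # output bigWig
        "--binSize", str(bin_size),
        "--normalizeUsing", "CPM",  # makes tracks comparable across depths
    ]


# Print the commands so they can be inspected before running them.
for bam in ["A.bam", "B.bam", "C.bam", "D.bam"]:
    print(shlex.join(bamcoverage_command(bam)))
```

Loading the resulting tracks in IGV with a shared, fixed scale (rather than autoscale) makes differences in background between samples easier to see.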

ADD COMMENT • link 5.1 years ago by GouthamAtla 12k

Thanks for your reply.

Yes, those counts are after filtering (quality > 30 and duplicates removed, but blacklisted regions not removed).

I did not use a control (input) BAM. The background is probably OK, because I can see clear, matching peaks in the replicate data. (The relative peak heights are the same among different samples; I used autoscale in IGV.) Maybe I should use the same scale for all tracks.

The question is: I would like to study the H3K27ac dynamics in the process from A to D. Can I compare the samples directly if they have different peaks?

ADD REPLY • link 5.1 years ago by Francis ▴ 20

Different peaks mean a gain or loss of peaks from A to D, which could be biologically real, but you need replicates and a differential analysis, for example with DESeq2.

ADD REPLY • link 5.1 years ago by GouthamAtla 12k

Yes, I hope the change is real, but it is so large that I am suspicious.

Thanks again and I will check the replicates.

ADD REPLY • link 5.1 years ago by Francis ▴ 20

score 1 · Answer 2 · 2019-03-10

5.1 years ago

Friederike 8.9k

The number of peaks will depend on many factors (including sequencing depth to a certain degree), but as geek_y pointed out, one of the most important factors is how well the enrichment step actually worked (and how much background noise you have in your input sample). If the read numbers you've shown are for filtered reads (applying the filters geek_y mentioned) in a mammalian sample, you should not have to worry about sequencing depth: H3K27ac is expected to give fairly narrow, sharp enrichments, similar to H3K4me3.

The CHANCE paper by Diaz et al. has a good discussion of how the IP strength can be assessed; you could use either the CHANCE tool or the "fingerprint" plot (plotFingerprint), which is the deepTools implementation of the CHANCE IP strength measure. A more comprehensive discussion of why you might not be seeing the expected number of enriched regions can be found in this review by Meyer & Liu. In brief: there are numerous sources of bias that may distort or overwhelm your signal of interest.
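To make the idea behind the fingerprint concrete, here is a toy sketch, not the CHANCE or deepTools implementation: rank genomic bins by read count and plot the cumulative fraction of reads against the fraction of bins. The bin counts below are made-up numbers for illustration only.

```python
from itertools import accumulate


def fingerprint_curve(bin_counts):
    """Return (fraction_of_bins, cumulative_fraction_of_reads) points.

    Bins are ranked from lowest to highest coverage, which is the general
    idea behind a fingerprint plot; this is a simplified illustration.
    """
    ranked = sorted(bin_counts)
    total = sum(ranked)
    if total == 0:
        raise ValueError("no reads in any bin")
    n = len(ranked)
    return [((i + 1) / n, c / total) for i, c in enumerate(accumulate(ranked))]


# Hypothetical per-bin read counts (not real data).
background_like = [10] * 10                       # reads spread evenly
enriched_like = [1, 1, 1, 1, 1, 1, 1, 1, 2, 90]   # reads piled into one bin

for label, counts in [("background-like", background_like),
                      ("enriched-like", enriched_like)]:
    print(label, fingerprint_curve(counts))
```

An evenly covered sample follows the diagonal, while a strongly enriched sample stays flat for most bins and rises steeply at the end. If the curves of A to D look very different from each other, differences in peak numbers may reflect IP efficiency rather than biology.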

ADD COMMENT • link 5.1 years ago by Friederike 8.9k

Thanks for your reply.

I will read the papers and try deepTools. But to repeat my question above: I would like to study the H3K27ac dynamics in the process from A to D. Can I compare the samples directly through the annotated genes if they have different peaks?

Thanks.

ADD REPLY • link 5.1 years ago by Francis ▴ 20

You can study it, for sure, but be careful with the conclusions you're drawing. You do not have replicates, for starters, so how will you determine whether the peak number fluctuations are reflective of technical noise (e.g. sample prep) or biologically meaningful? Particularly if you see that the enrichment scores/fingerprints are very different between the samples, I'd be super cautious.

ADD REPLY • link 5.1 years ago by Friederike 8.9k

Thanks. I am suspicious of this, too. I will check the replicates and randomly select the same number of reads from each sample for downstream analysis, to verify whether the differences are biologically meaningful or have other causes.
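A minimal sketch of the downsampling arithmetic, using the read counts from the question and assuming every sample is reduced to the depth of the smallest library (D.bam). How the fraction is then applied is up to the tool; one common option, assumed here, is the `-s SEED.FRACTION` argument of `samtools view`.

```python
# Read counts as listed in the question.
read_counts = {
    "A.bam": 113_664_782,
    "B.bam": 87_305_342,
    "C.bam": 83_029_416,
    "D.bam": 49_539_212,
}


def downsample_fractions(counts):
    """Fraction of reads to keep per file so all match the smallest library."""
    target = min(counts.values())
    return {name: target / n for name, n in counts.items()}


fractions = downsample_fractions(read_counts)
for name, frac in fractions.items():
    # The fraction could be passed to a subsampling tool, e.g. as the
    # decimal part of `samtools view -s SEED.FRACTION` (an assumption
    # about the workflow, not something stated in the thread).
    print(f"{name}\t{frac:.4f}")
```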

Thanks for your advice!

ADD REPLY • link 5.1 years ago by Francis ▴ 20