Question: Peak calling using macs2?
star (Netherlands) wrote, 18 months ago:

I have paired-end ChIP-seq data (101 bp reads) with 2 biological replicates per sample. I have done peak calling with macs2, but I have some questions about it.

I also got this warning:

WARNING @ Thu, 07 Jun 2018 17:06:05: #2 Since the d (197) calculated from paired-peaks are smaller than 2*tag length, it may be influenced by unknown sequencing problem! 
WARNING @ Thu, 07 Jun 2018 17:06:05: #2 You may need to consider one of the other alternative d(s): 197 
WARNING @ Thu, 07 Jun 2018 17:06:05: #2 You can restart the process with --nomodel --extsize XXX with your choice or an arbitrary number. Nontheless, MACS will continute computing.
  1. I added --nomodel --extsize 197, --nomodel --extsize 147, and --nomodel --extsize 202 (each in a separate macs2 run) and got results without any warning. Which one is the most appropriate?

  2. Are broad peaks extended versions of narrow peaks? If I intersect the two sets, should I expect 100% overlap between the narrow peaks and the broad ones?

  3. Which kind of peak (narrow/broad) is appropriate for studying H3K27ac, H3K4me1, H3K4me3, and H3K27me3?

  4. If there is no control sample to use as background, can I use the default parameters?

Thanks in advance for any suggestions!

peak-calling peaks chip-seq macs2 • 1.3k views
Modified 18 months ago by RamRS • written 18 months ago by star

(1) In my experience, I received this warning when my ChIP signal was not very strong. MACS2 called very few peaks, which made sense once I looked at the bigWig files for my samples: there was very little difference between my input sample and my IP sample.

(2) I believe broad peak calling just combines nearby peaks into larger peaks.

(3) I'm not familiar with those factors, but narrow peaks are appropriate if they bind at very specific places on the chromatin. Broad peaks are a better fit if the factor binds at many locations with less specificity.

(4) I have never tried to call peaks without a control input sample. It's pretty important in ChIP-seq to provide background signal in some way.

Hope it helps
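
As a toy illustration of the idea in (2), the sketch below merges peaks that lie within a chosen gap of each other. It is not MACS2's actual algorithm (MACS2's --broad mode links nearby enriched regions using a looser cutoff); `merge_nearby` and `max_gap` are hypothetical names, and the coordinates are made up.

```python
def merge_nearby(peaks, max_gap):
    """Merge (chrom, start, end) peaks on the same chromosome whose gap is <= max_gap.

    Toy illustration only: this is NOT how MACS2 builds broad peaks.
    """
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Close enough to the previous peak: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up narrow peaks: the first two are 50 bp apart, the third is far away.
narrow = [("chr1", 100, 300), ("chr1", 350, 600), ("chr1", 5000, 5200)]
broad_like = merge_nearby(narrow, max_gap=100)
print(broad_like)
```

Under this simplified picture every narrow peak falls inside some merged region, which is why a near-complete overlap between narrow and broad calls would be the expectation. Real MACS2 output can deviate from that because the two modes apply different cutoffs.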

Written 18 months ago by goodez

Let me add my two cents:

  1. "Correct" is hard to pin down here. Just call peaks with different --extsize values and see how many peaks overlap between the runs. A high percentage of peaks should overlap. (I did this last week when I saw the warning with my own ChIP data, found a large overlap, and so ignored the warning.)

  2. Most likely, but I have never tested it. (@goodez is correct.)

  3. Narrow peaks will work fine. If the signal spans a large region, that region is still included in the narrow peak file. Unless you are trying to associate peak length with some observation (e.g. like this), narrow peaks will be good enough.

  4. I tested this a couple of times. Most of the time, all peaks called with a background overlapped peaks called without one, but the reverse was not true. Without a background, MACS calls many extra peaks that are not real peaks. A simple workaround is to use the background from another sample, provided both samples belong to the same condition.
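
The overlap checks in points 1 and 4 can be sketched in a few lines of Python. This is a minimal illustration under assumptions: `fraction_overlapping` is a hypothetical helper, peaks are (chrom, start, end) tuples in half-open BED-style coordinates, and the coordinates are made up. In practice the tuples would come from the first three columns of the MACS2 peak files, or a tool such as bedtools intersect could do the same job.

```python
from collections import defaultdict


def fraction_overlapping(query, reference):
    """Fraction of query peaks that overlap at least one reference peak."""
    if not query:
        return 0.0
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))
    hits = 0
    for chrom, start, end in query:
        # Half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            hits += 1
    return hits / len(query)


# Made-up peak sets standing in for two MACS2 runs.
with_control = [("chr1", 100, 400), ("chr1", 1000, 1300)]
without_control = [("chr1", 90, 420), ("chr1", 1000, 1300), ("chr2", 50, 250)]

print(fraction_overlapping(with_control, without_control))
print(fraction_overlapping(without_control, with_control))
```

The measure is asymmetric on purpose: computing it in both directions shows whether one run is roughly a subset of the other, which is the pattern described in point 4. The same function works for comparing runs with different --extsize values.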

Written 18 months ago by venu (modified 18 months ago)

Hello star!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/4448

This is typically not recommended as it runs the risk of annoying people in both communities.

Written 18 months ago by Pierre Lindenbaum
EagleEye (Sweden) wrote, 18 months ago:
  • For most transcription factor (TF) ChIP-seq experiments, peaks are sharp/narrow; for histone marks, peaks are more spread out/broader.

  • If you plot the peaks over the promoter regions of genes, you will find that the histone signal drops sharply at the 1 to 3 bp region of the TSS (center). This is due to nucleosome depletion/partial loss at the TSS, with nucleosomes positioned upstream and downstream of the promoter region for chromatin accessibility.

  • TFs are usually enriched over promoters (-TSS+).

  • It is important to use appropriate controls (for example, input DNA) for ChIP-seq analysis. Regions that are overrepresented because of the sonication step in the ChIP experiment can be accounted for by using input DNA as background.
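
To make the narrow/broad and control choices above concrete, here is a minimal sketch that only assembles macs2 callpeak command lines without running them. The helper `macs2_callpeak_args` and all file names are hypothetical; the flags (-t, -c, -f, -g, -n, --broad, --nomodel, --extsize) are standard MACS2 options. It assumes human data (-g hs) and defaults to -f BAMPE, which for paired-end BAM files makes MACS2 use the observed fragment sizes instead of a modelled or fixed d.

```python
import shlex


def macs2_callpeak_args(treatment, name, control=None, broad=False,
                        fmt="BAMPE", genome="hs", extsize=None):
    """Build (but do not run) a macs2 callpeak argument list."""
    args = ["macs2", "callpeak", "-t", treatment,
            "-f", fmt, "-g", genome, "-n", name]
    if control is not None:
        args += ["-c", control]          # input DNA as background
    if broad:
        args.append("--broad")           # broad peak calling
    if extsize is not None:
        # Skip model building and extend reads to a fixed fragment size.
        args += ["--nomodel", "--extsize", str(extsize)]
    return args


# Hypothetical file names. H3K27me3 is commonly called as a broad mark.
broad_cmd = macs2_callpeak_args("H3K27me3_rep1.bam", "H3K27me3_rep1",
                                control="input_rep1.bam", broad=True)
# Single-end style run with a fixed extension size, as in the question.
fixed_cmd = macs2_callpeak_args("H3K4me3_rep1.bam", "H3K4me3_rep1",
                                control="input_rep1.bam", fmt="BAM",
                                extsize=197)

for cmd in (broad_cmd, fixed_cmd):
    print(" ".join(shlex.quote(a) for a in cmd))
```

Keeping the command as a list makes it easy to pass to subprocess.run later and to generate one call per replicate or per --extsize value for the overlap comparison.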

Written 18 months ago by EagleEye (modified 18 months ago)