Question

peak detection: different q-value for different experiments?

8.3 years ago

nicolas.descostes ▴ 160

Hi all,

A simple question. Would you use different q-value thresholds for different peak detections in a dataset?

The ChIP-seq experiments all share the same input, and I am calling peaks with MACS2. I would rather use the same q-value for all experiments, but since some are noisier than others, would it be wrong to use a less stringent q-value for those? Would you say that this would be "artificially" improving the detection?

Thanks.

ChIP-Seq genome peak detection q-value • 4.8k views

updated 4.0 years ago by Tian ▴ 50 • written 8.3 years ago by nicolas.descostes ▴ 160

score 2 · Answer 1 · 2017-03-31

I don't know if it's the ideal way to do it, but yes, I've done it. I use the same cutoff for two samples I want to compare directly (like protein X in treated vs. untreated), because the samples you compare should have worked equally well; otherwise it's not a fair comparison. But for two separate experiments I will sometimes use different p- or q-values with MACS2. What I usually do is run MACS2 on the dataset with several p- and q-values (e.g. q0.05, q0.01, p0.0001, p0.0005), convert the output peak lists to .bed files, make .wig files from the ChIP-seq alignments, and load everything into the UCSC Genome Browser. That way I can see the ChIP enrichment next to all the peaks MACS2 called, and then try to decide which MACS2 settings detected the peaks most accurately.
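The threshold sweep described above could be scripted roughly like this. It is only a sketch: the file names, the sample name, and the `hs` genome size are placeholders, and the helper names `macs2_commands` and `narrowpeak_to_bed` are made up for illustration. It builds `macs2 callpeak` command lines without running them, and trims narrowPeak rows (BED6+4) down to four BED columns for the browser.

```python
from itertools import chain


def macs2_commands(treatment, control, name,
                   q_values=(0.05, 0.01), p_values=(1e-4, 5e-4), genome="hs"):
    """Return one `macs2 callpeak` argument list per threshold.

    All arguments are placeholders; adjust them to your own data.
    """
    cutoffs = chain((("-q", q) for q in q_values),
                    (("-p", p) for p in p_values))
    commands = []
    for flag, value in cutoffs:
        # Label each run by its cutoff, e.g. "sample_q0.05" or "sample_p0.0001".
        label = f"{name}_{flag.lstrip('-')}{value:g}"
        commands.append([
            "macs2", "callpeak",
            "-t", treatment,      # ChIP (IP) alignments
            "-c", control,        # input alignments
            "-g", genome,         # effective genome size shortcut
            "-n", label,          # prefix of the output files
            "--outdir", label,    # one output directory per cutoff
            flag, f"{value:g}",
        ])
    return commands


def narrowpeak_to_bed(lines):
    """Keep chrom, start, end and name from narrowPeak rows."""
    return ["\t".join(line.rstrip("\n").split("\t")[:4])
            for line in lines if line.strip()]


if __name__ == "__main__":
    for cmd in macs2_commands("chip.bam", "input.bam", "sample"):
        print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run(cmd, check=True)` if MACS2 is installed. Keeping one output directory per cutoff makes it easy to load the resulting peak files side by side as separate browser tracks.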

I happened to have just made a relevant image for a presentation: here's MACS2 peak calling on a ChIP-seq dataset (IP and input) using 8 different p/q values. Which is the correct value to use? Some are obviously not stringent enough and others might be too stringent; I think it's pretty subjective.

[Image: MACS2 peak calls on one ChIP-seq dataset (IP and input) at 8 different p/q values]

score 0 · Answer 2 · 2017-03-31

Thank you very much for your complete reply.

I do follow the same strategy of trying several q-values before making my decision. I am comparing two different experiments (two antibodies) and want to assess the overlap between them.

If I draw a Venn diagram of the detected peaks, I get around 20% overlap for one mark and 90% for the other. The 90% one is very noisy. If I plot a heatmap of the two marks over the union of peaks, then "visually" I would expect about 90% overlap for both. That is why I thought of lowering the threshold for the 20% one.
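To make the asymmetry concrete, here is a small sketch of how such overlap percentages can be computed in both directions. The coordinates are invented toy data, and the function name `fraction_overlapping` is hypothetical; for real peak files a tool such as `bedtools intersect -u` would be the more usual route.

```python
from collections import defaultdict


def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in A that overlap at least one peak in B.

    Peaks are (chrom, start, end) tuples in half-open BED coordinates,
    so intervals that merely touch do not count as overlapping.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in peaks_a:
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chrom[chrom]):
            hits += 1
    return hits / len(peaks_a) if peaks_a else 0.0


# Toy example: mark A has many called peaks, mark B only one.
mark_a = [("chr1", 100, 200), ("chr1", 300, 400), ("chr1", 500, 600),
          ("chr1", 700, 800), ("chr1", 900, 1000)]
mark_b = [("chr1", 150, 250)]

if __name__ == "__main__":
    print(f"A overlapping B: {fraction_overlapping(mark_a, mark_b):.0%}")
    print(f"B overlapping A: {fraction_overlapping(mark_b, mark_a):.0%}")
```

The two numbers differ because each is normalised by the size of its own peak set, so the percentages shift whenever the threshold of either mark changes the number of called peaks. Recomputing both fractions at each candidate cutoff shows how sensitive the overlap is to the threshold choice.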

Having said that, the justification for the thresholds is purely visual, which I do not like much. Some could argue that we are investigating a signal that can be assessed visually, so this approach is valid, which is a good point. Others could argue that if I lower the threshold for both, I should keep the same proportions, and that not doing so counts as "artificial fitting".

I feel that both arguments are valid and that there is no clear answer to this problem. I was hoping somebody could bring up a point that would help me choose.