peak detection: different q-value for different experiments?
7.0 years ago

Hi all,

A simple question. Would you use different q-value thresholds for different peak detections in a dataset?

The ChIP-seq experiments all share the same input, and I am performing the peak detection with MACS2. I would rather use the same q-value for all experiments, but since some are noisier than others, would it be wrong to use a less stringent q-value for those? Would you say that would be "artificially" improving the detection?

Thanks.

ChIP-Seq genome peak detection q-value • 3.8k views
7.0 years ago
Mike ▴ 60

I don't know if it's the ideal way to do it, but yes, I've done it. I would use the same cutoff for 2 samples I want to compare directly to each other (like protein X in treated vs untreated), because the samples you want to compare have hopefully worked equally well; otherwise it's not a fair comparison. But if it's 2 separate experiments, I will sometimes use different p or q values with MACS2.

What I usually do is run MACS2 on the dataset using several p and q values (e.g. q0.05, q0.01, p0.0001, p0.0005, etc.), convert all the output lists of peaks to .bed files, make .wig files from the ChIP-seq alignments, and load them all into the UCSC Genome Browser so I can see the ChIP enrichment alongside all the peaks that MACS2 called. Then I try to decide which MACS2 settings most accurately detected the peaks.
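The threshold sweep described above could be scripted along these lines. This is only a sketch: the file names `chip.bam` and `input.bam`, the `-g hs` genome size, and the `macs2_sweep` output directory are placeholders I made up, not anything from the actual analysis. The script only builds the `macs2 callpeak` command lines (using the `-q` and `-p` cutoff options) so they can be inspected before running anything.

```python
from pathlib import Path

# Hypothetical file names; substitute your own alignments.
CHIP_BAM = "chip.bam"
INPUT_BAM = "input.bam"

# Cutoffs to sweep: ("q", value) becomes -q value, ("p", value) becomes -p value.
# Values are kept as strings so they appear on the command line exactly as typed.
THRESHOLDS = [("q", "0.05"), ("q", "0.01"), ("p", "0.0001"), ("p", "0.0005")]


def build_macs2_commands(chip, control, thresholds, genome="hs", outdir="macs2_sweep"):
    """Return one `macs2 callpeak` argument list per (kind, value) threshold."""
    commands = []
    for kind, value in thresholds:
        # Name each run after its cutoff so the output files are easy to tell apart.
        name = f"{Path(chip).stem}_{kind}{value}"
        commands.append([
            "macs2", "callpeak",
            "-t", chip, "-c", control,
            "-f", "BAM", "-g", genome,
            "-n", name, "--outdir", outdir,
            f"-{kind}", value,
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_macs2_commands(CHIP_BAM, INPUT_BAM, THRESHOLDS):
        print(" ".join(cmd))
        # To actually run a command: subprocess.run(cmd, check=True)
```

Each run writes its own set of peak files under the given name, which can then be loaded as separate tracks for the visual comparison.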

I happened to have just made a relevant image for a presentation: here's MACS2 peak calling on a ChIP-seq dataset (IP and input) using 8 different p/q values. Which is the correct value to use? Some are obviously not stringent enough and others might be too stringent; I think it's pretty subjective.

[Image: MACS2 peak calls at 8 different p/q values]


Hi Mike:

A very good example of p/q thresholds. My question may be a bit strange: can I do no filtering at all (so that it returns all the peaks shown in red in your plot) and then do the filtering myself manually? Looking at your plot, I think the filtering cutoff just selects peaks based on their intensity, right?

I want to smooth across the peaks in one region (for example a gene body). I think it's improper to select only the peaks with q-value 0.05 and then smooth between them; instead, I think I should consider all the peaks in the region, significant or not. My question post is this.

Best, Tian
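One way the "call leniently, filter afterwards" idea could look in practice is sketched below. It assumes the peaks are in narrowPeak format, where column 9 holds -log10(q-value); the example records are made up purely for illustration. Filtering afterwards is not guaranteed to reproduce what a stricter MACS2 run would report, since peak boundaries can depend on the cutoff used during calling.

```python
import math


def filter_narrowpeak(lines, max_q):
    """Keep narrowPeak records whose q-value is <= max_q.

    Column 9 of the narrowPeak format stores -log10(q-value), so a record
    passes when that column is >= -log10(max_q).
    """
    min_neglog10_q = -math.log10(max_q)
    kept = []
    for line in lines:
        # Skip blank lines and track/comment headers.
        if not line.strip() or line.startswith(("track", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[8]) >= min_neglog10_q:
            kept.append(line.rstrip("\n"))
    return kept


# Made-up records (chrom, start, end, name, score, strand,
# signalValue, -log10 p, -log10 q, summit offset).
EXAMPLE = [
    "chr1\t100\t200\tpeak_1\t50\t.\t5.0\t6.0\t4.0\t50",
    "chr1\t500\t600\tpeak_2\t20\t.\t2.0\t3.0\t1.5\t40",
    "chr1\t900\t950\tpeak_3\t5\t.\t1.2\t1.0\t0.2\t10",
]

if __name__ == "__main__":
    for cutoff in (0.05, 0.01):
        print(cutoff, len(filter_narrowpeak(EXAMPLE, cutoff)))
```

Keeping the full lenient list around also leaves every peak in a region available for smoothing, with the q-value column acting as a weight or a later filter rather than a hard gate.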

7.0 years ago

Thank you very much for your complete reply.

I do indeed use the same strategy of trying several q-values to make my decision. I am comparing two different experiments (two antibodies) and want to assess their overlap.

If I draw a Venn diagram based on the detected peaks, I obtain around 20% overlap for one mark and 90% for the other. The 90% one is very noisy. If I plot a heatmap of the two marks based on the union of peaks, "visually" I would expect a 90% overlap for both. That is why I thought of lowering the threshold for the 20% one.
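To make the asymmetry concrete, here is a small sketch of how such overlap percentages might be computed. The peak coordinates are toy values, not real data, and the function name is my own. The point is that the fraction depends on which peak set is the denominator, so changing the threshold for one mark shifts both numbers.

```python
def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates, as in BED.
    """
    if not peaks_a:
        return 0.0
    # Group the second set by chromosome so only same-chromosome peaks are compared.
    by_chrom = {}
    for chrom, start, end in peaks_b:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in peaks_a:
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(peaks_a)


# Toy peak sets: a short list and a longer list that contains it.
few_peaks = [("chr1", 100, 200), ("chr1", 1000, 1100)]
many_peaks = [
    ("chr1", 150, 250), ("chr1", 1050, 1150), ("chr1", 2000, 2100),
    ("chr1", 3000, 3100), ("chr2", 100, 200),
]

if __name__ == "__main__":
    print(fraction_overlapping(few_peaks, many_peaks))
    print(fraction_overlapping(many_peaks, few_peaks))
```

For real peak lists this quadratic scan would be slow; a dedicated interval tool would be the usual choice, but the logic is the same.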

Having said that, the justification for the thresholds is visual only, which I do not like that much. However, some could argue that we are investigating signal that can be assessed visually, so this approach is valid, which is a good point. Others could argue that if I lower the thresholds, I should keep the same proportions for both, and that not doing so amounts to "artificial fitting".

I feel that both arguments are valid and that there is no answer to this problem. I was hoping that somebody could bring up a point that would help me choose.
