Question

Can we compare peaks of different transcription factors when they were obtained by different criteria (IDR/FDR)?

5.7 years ago

salamandra ▴ 550

I have ChIP-seq datasets for 3 different transcription factors (TFs) in the same cell type. Two of the TFs have two replicates each, so IDR can be applied to determine significant peaks. For the third TF I have only one replicate, so I can only select peaks based on p-value/FDR.

Is it OK to compare the peaks of the first two TFs with the peaks of the third TF, even though the peaks were determined with different criteria (IDR vs FDR)?

By comparing, I mean determining which genes are targeted in common.

ChIP-Seq IDR RNA-Seq • 1.8k views

updated 3.1 years ago by Biostar 20 • written 5.7 years ago by salamandra ▴ 550

score 1 · Answer 1 · 2018-08-01

5.7 years ago

nanaki_ksc ▴ 10

Comparing, yes. But beware of trying to quantify the differences between the 3 different experiments.

Not only have the peaks been obtained by different methods, but the original data also come from experiments that used different antibodies with different specificities, possibly quite different protocols, and sequencing that may differ in coverage, etc. In ChIP-seq you can't say "this locus has twice as much of this TF as of that other TF".

If you have the negative controls (input), you could call the peaks again using a tool like Zerone. It will call presence/absence of the TF in each genomic window instead of giving you the coordinates of the peaks, but the results will be directly comparable between experiments.

5.7 years ago by nanaki_ksc ▴ 10

Thank you. Your answer raises one more question. Let me explain my experimental layout a little better.

I have reprogrammed cells with 3 transcription factors (A, B and C) and have 6 experimental conditions:

condition C1: targets of A together with the other TFs (cells transfected with A, B and C)

condition C2: targets of A alone (cells transfected only with A)

condition C3: targets of B together with the other TFs

condition C4: targets of B alone (cells transfected only with B)

condition C5: targets of C together with the other TFs

condition C6: targets of C alone (cells transfected only with C)

For all conditions I have ChIP and Input samples.

For conditions C1 to C4 I have two replicates of ChIP and two replicates of Input. My plan for each of these conditions is to call peaks for replicate 1 and for replicate 2 (each using its own ChIP and Input) and then run IDR to identify significant peaks that are consistent between the two replicates.

For conditions C5 and C6 I only have one sample each (no replicates) for ChIP and Input, so I'll select significant peaks based on FDR.
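For the single-replicate conditions, the FDR selection could look roughly like the sketch below. It assumes the peak caller wrote ENCODE-style narrowPeak lines, where column 9 holds -log10(q-value) and -1 means no q-value was assigned. The function name, the 0.05 cutoff and the example peaks are made up for illustration.

```python
import math


def filter_by_fdr(narrowpeak_lines, fdr=0.05):
    """Keep peaks whose q-value is at or below `fdr`.

    Assumes narrowPeak layout: column 9 (index 8) is -log10(q-value),
    with -1 meaning "no q-value assigned" (such peaks are dropped).
    """
    min_score = -math.log10(fdr)
    kept = []
    for line in narrowpeak_lines:
        # Skip blank lines and optional header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        neg_log10_q = float(fields[8])
        if neg_log10_q >= min_score:
            kept.append(fields)
    return kept


# Hypothetical peaks, only to show the call.
example = [
    "chr1\t100\t200\tpeak_1\t0\t.\t5.0\t4.0\t2.0\t50",
    "chr1\t300\t400\tpeak_2\t0\t.\t2.0\t1.5\t0.5\t40",
]
significant = filter_by_fdr(example, fdr=0.05)
```

Whatever cutoff is used here, it is a different kind of threshold from the IDR cutoff applied to C1 to C4, so the resulting peak sets are not filtered equally strictly.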

Then I want to do a differential binding analysis (with MAnorm, DIME or DBChIP) for C1 vs C2, C3 vs C4 and C5 vs C6, to identify the targets that each TF binds differentially when it is together with the others versus when it is alone. This would give me 3 lists of targets, which would then be converted into genes.

Finally, I would take either the intersection or the union of the genes in the three lists.
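Once the three gene lists exist, this last step is plain set arithmetic. A minimal sketch, with placeholder gene names rather than real results:

```python
# Hypothetical gene lists from the three differential binding comparisons.
targets = {
    "A": {"Gene1", "Gene2", "Gene3"},
    "B": {"Gene2", "Gene3", "Gene4"},
    "C": {"Gene2", "Gene5"},
}

# Genes differentially bound by all three TFs (intersection).
shared_by_all = set.intersection(*targets.values())

# Genes differentially bound by at least one TF (union).
bound_by_any = set.union(*targets.values())

print(sorted(shared_by_all))
print(sorted(bound_by_any))
```

The intersection is the stricter choice and is the one most sensitive to the C list having been filtered by a different criterion than A and B.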

Is this analysis OK, or does it conflict with what you said about not being able to quantitatively compare TFs?

5.7 years ago by salamandra ▴ 550

score 0 · Answer 2 · 2018-07-31

If you are looking at which genes are targeted by the TFs, you can confirm this by looking at the tracks for enrichment at the selected gene loci. IDR, p-value and FDR are all useful for identifying statistically significant peaks. There is no restriction on which method to use, but try to be consistent. Since the data are from the same cell type, you can simply call peaks and intersect them with bedtools intersect to find the common regions bound by the TFs.
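To make the overlap logic concrete, here is a small pure-Python sketch of the idea. It is not bedtools itself, just an illustration that reports each peak of one TF that overlaps at least one peak of another, similar in spirit to `bedtools intersect -u`. It assumes peaks are BED-style (chrom, start, end) tuples with 0-based, half-open coordinates; the function name and example coordinates are made up.

```python
from collections import defaultdict


def overlapping_peaks(peaks_a, peaks_b):
    """Return peaks from peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples, 0-based and half-open, so two
    peaks that merely touch (end == start) do not count as overlapping.
    """
    # Group the second peak set by chromosome for quick lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    hits = []
    for chrom, start, end in peaks_a:
        candidates = by_chrom.get(chrom, ())
        if any(start < b_end and b_start < end for b_start, b_end in candidates):
            hits.append((chrom, start, end))
    return hits


# Hypothetical peaks for two TFs.
tf1_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
tf2_peaks = [("chr1", 150, 250), ("chr2", 200, 300)]
common = overlapping_peaks(tf1_peaks, tf2_peaks)
```

This pairwise scan is fine for a sketch but slow on real peak sets; bedtools handles sorting and large files far more efficiently.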