Question

Too many ChIP-seq peaks after DiffBind: how do I set criteria to select more "significant" genes?

0


4.0 years ago

yepeh72919 ▴ 10

Hello all!

I ran a ChIP-seq analysis with DiffBind to compare 2 different conditions, and the number of peaks obtained is very large (~49,000 peaks). Some genes appear more than once in the peak list, but the peaks come from different regions of the same gene. I would like to do some downstream analysis (e.g. gene ontology), but the number of peaks is far too large.

I have the following questions:

Should I apply a cutoff on the DiffBind score? The scores range from 1.5 to 6.
Is there a way to limit the number of peaks passed to downstream analysis?

One thing I can think of: set an arbitrary cutoff on the DiffBind score, e.g. keep only peaks with scores > 3.5.

Another thing I can think of: use the ratio of peak height in condition 1 vs condition 2. That way, I can select genes whose peaks are more than 1.5-fold higher in condition 1.

If peak height is a good way to obtain more significant genes, what tools do you recommend?
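To make the two cutoff ideas above concrete, here is a minimal sketch in plain Python. It is only an illustration: the rows are made-up numbers, the column names (`gene`, `fold_log2`, `fdr`) are assumptions about what an exported report might contain rather than DiffBind's actual output format, and the 1.5-fold and FDR 0.05 thresholds are placeholders. It keeps peaks that pass both filters and collapses multiple peaks per gene so that each gene is counted once.

```python
import math

# Hypothetical rows standing in for an exported differential-binding report.
# Column names and values are illustrative assumptions, not real data.
peaks = [
    {"peak": "p1", "gene": "GeneA", "fold_log2": 2.1, "fdr": 0.001},
    {"peak": "p2", "gene": "GeneA", "fold_log2": 0.3, "fdr": 0.200},
    {"peak": "p3", "gene": "GeneB", "fold_log2": -1.2, "fdr": 0.010},
    {"peak": "p4", "gene": "GeneC", "fold_log2": 0.7, "fdr": 0.040},
    {"peak": "p5", "gene": "GeneD", "fold_log2": 0.4, "fdr": 0.030},
]


def select_genes(rows, min_ratio=1.5, max_fdr=0.05):
    """Return sorted gene names with at least one peak that is enriched
    in condition 1 by min_ratio or more and has FDR <= max_fdr."""
    # A linear ratio of 1.5 corresponds to log2(1.5) on the log2 scale.
    min_log2 = math.log2(min_ratio)
    best = {}
    for row in rows:
        if row["fdr"] > max_fdr or row["fold_log2"] < min_log2:
            continue  # fails the significance or the effect-size filter
        gene = row["gene"]
        # Keep only the most significant passing peak for each gene.
        if gene not in best or row["fdr"] < best[gene]["fdr"]:
            best[gene] = row
    return sorted(best)


selected = select_genes(peaks)
print(selected)
```

Filtering on a significance measure and an effect size together, then deduplicating by gene, gives a shorter gene list for something like gene ontology than filtering on a single arbitrary score.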

Thanks!

ChIP-Seq diffbind • 1.2k views

updated 4.0 years ago by Rory Stark ★ 2.0k • written 4.0 years ago by yepeh72919 ▴ 10

1


If the number of significant regions is unexpectedly high, check MA plots to see whether the normalization is off and has produced many false positives. In fact, one should always do that. Proper normalization should center the majority of regions at roughly y = 0.
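As a rough illustration of that check, the sketch below computes MA values from two vectors of normalized counts and looks at the median of M, which should sit near 0 when most regions are unchanged. The counts are invented, the helper names `ma_values` and `median_m` are hypothetical, and a real analysis would inspect the full plot rather than a single summary number.

```python
import math
from statistics import median


def ma_values(counts_a, counts_b, pseudocount=1.0):
    """Return (A, M) pairs per region.
    A is the mean log2 signal, M is the log2 ratio of a over b."""
    pairs = []
    for a, b in zip(counts_a, counts_b):
        log_a = math.log2(a + pseudocount)  # pseudocount avoids log2(0)
        log_b = math.log2(b + pseudocount)
        pairs.append(((log_a + log_b) / 2, log_a - log_b))
    return pairs


def median_m(counts_a, counts_b):
    """Median log2 ratio across regions; close to 0 if well centered."""
    return median(m for _, m in ma_values(counts_a, counts_b))


# Invented normalized counts per region for two conditions.
cond1 = [120, 80, 400, 55, 230, 90]
cond2 = [110, 85, 390, 60, 30, 95]

# Same data with condition 2 scaled down, mimicking a normalization
# that is off by a constant factor.
cond2_shifted = [c * 0.5 for c in cond2]

print("centered:", round(median_m(cond1, cond2), 3))
print("shifted: ", round(median_m(cond1, cond2_shifted), 3))
```

If the bulk of the points is shifted away from 0, as in the second case, a large share of the "significant" regions may simply reflect the normalization rather than real differences in binding.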

4.0 years ago by ATpoint 82k

score 1 · Answer 1 · 2020-05-04

1
