Question

Defining ChIP-seq target genes: is there a robust cut-off?

8.5 years ago

YanO ▴ 140

I am trying to identify the target genes of my transcription factor (TF) from ChIP-seq data. I have called peaks, and I have also counted the reads falling around the transcription start site (TSS) of each gene. How should I decide whether a gene is bound by my TF?

Does the following make sense? Pick a cutoff for the number of reads (per million) falling within, for example, ±2 kb of the TSS. I've made heatmaps so I could estimate by eye the number of genes bound and take the cutoff from that, but this seems like a bad idea. I also don't want to simply assign each peak to its nearest gene and, provided the peak is close to the TSS, call that gene a target; that seems much too liberal. Similarly, looking at my heatmaps and picking e.g. 'the top 200 genes' seems flimsy.
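For concreteness, the per-gene quantity described above (reads per million within a window around the TSS) could be computed roughly as follows. This is only a minimal sketch: the function name `tss_window_rpm`, the use of read 5' positions, and the toy numbers are all assumptions made for illustration, and it handles a single chromosome. It does not answer which cutoff to apply to the resulting values.

```python
from bisect import bisect_left, bisect_right


def tss_window_rpm(read_positions, tss, total_reads, flank=2000):
    """Reads per million within +/- `flank` bp of a TSS.

    read_positions: sorted read 5' positions on the same chromosome as the TSS
    total_reads:    total mapped reads in the library (for per-million scaling)
    """
    lo = bisect_left(read_positions, tss - flank)
    hi = bisect_right(read_positions, tss + flank)
    return (hi - lo) * 1e6 / total_reads


# Toy data, invented purely for illustration.
reads = sorted([100, 950, 1200, 2900, 3100, 8000])
rpm = tss_window_rpm(reads, tss=1000, total_reads=2_000_000)
print(rpm)
```

Running the same calculation on an input or IgG control, if one exists, would give a per-gene background to compare against instead of an eyeballed threshold.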

Is there a robust definition for whether a gene is a target or not, using ChIP-seq data?

Any advice would be most appreciated.

ChIP-Seq • 3.1k views

updated 8.5 years ago by Ido Tamir 5.2k • written 8.5 years ago by YanO ▴ 140

score 3 · Answer 1 · 2016-03-01

Have a look at the GREAT tool (http://bejerano.stanford.edu/great/public/html/) and the accompanying paper (http://www.ncbi.nlm.nih.gov/pubmed/20436461), where they outline different strategies for assigning peaks to genes more comprehensively. Whether you are more comfortable with a more conservative approach is up to you. I guess it depends on the transcription factor, how often you find it bound, how robust the peaks are, and so on. If you are the adventurous type and actually have the data, you could even include TADs or CTCF regulatory boundaries.
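To give a flavour of the regulatory-domain idea, here is a rough, strand-aware sketch that assigns a peak summit to every gene whose basal domain contains it. It is not GREAT's implementation: the 5 kb upstream / 1 kb downstream values are what I recall as GREAT's default basal domain and should be checked against its documentation, the distal extension step is left out entirely, and the function names and gene coordinates are invented for illustration.

```python
def basal_domain(tss, strand, upstream=5000, downstream=1000):
    """Return (start, end) of a strand-aware window around a TSS.

    The 5 kb / 1 kb defaults are an assumption; adjust to taste.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    # On the minus strand, "upstream" lies at higher coordinates.
    return tss - downstream, tss + upstream


def genes_for_peak(summit, genes):
    """List genes whose basal domain contains the peak summit.

    genes: iterable of (name, tss, strand) on the same chromosome as the peak.
    """
    hits = []
    for name, tss, strand in genes:
        start, end = basal_domain(tss, strand)
        if start <= summit <= end:
            hits.append(name)
    return hits


# Hypothetical genes, coordinates invented for illustration.
genes = [("geneA", 10_000, "+"), ("geneB", 20_000, "-")]
targets = genes_for_peak(6_000, genes)
print(targets)
```

A peak can fall in more than one gene's domain, or in none, which is one way this differs from nearest-gene assignment.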