Changing distance to TSS (tssRegion in ChIPseeker) during peak annotation in ChIP-seq analysis does not change the total number of genes associated with the peaks
11 months ago
vyazar ▴ 50


I have an urgent question that I am sure is straightforward for experts:

In my ChIP-seq analysis I changed the "tssRegion" argument (the distance to the nearest TSS) from (-300, 300) to (-3000, 3000), and then even to (-10000, 10000), at the peak annotation step in ChIPseeker, but the number of genes assigned to the called peaks is always the same. I then used a different peak annotator (ChIPpeakAnno) with a TSS region of (-5000, 5000), and again the number of genes assigned to the called peaks is the same.

It seems that all peak annotators ultimately assign each peak to a single gene, so if the number of called peaks does not change, the number of annotated genes stays the same irrespective of the "tssRegion" range. Increasing the range appears only to make the annotation algorithm look for genes within a wider window around each peak. The number of genes assigned to the called peaks does NOT change as long as every peak in the bed file is already annotated; only peaks that previously had no gene assigned have a higher chance of being annotated when "tssRegion" is larger. That would explain why the number of genes is so stable in my case: all of my peaks are already annotated.

Am I right, or am I missing something BIG here?

Thank you!

The basic principle of ChIP-seq peak annotation is to assign each peak to its nearest TSS (gene). So even if you widen the window upstream or downstream of the TSS, the number of annotated genes will remain constant. In ChIPseeker, "tssRegion" only defines which peaks are labelled as promoter peaks; it does not change which gene is nearest to a peak. If you increase the window to 10000, one peak might be annotated to several genes, depending on the annotator and its settings, but the total number of genes would remain similar.
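
To make the point concrete, here is a minimal toy sketch in Python (not ChIPseeker's actual code). It assumes made-up TSS positions and peak summits on a single chromosome and ignores strand, and the names `TSS`, `PEAKS`, `nearest_gene` and `annotate` are hypothetical. Each peak is assigned to its nearest TSS first, and the window is only used afterwards to decide whether the peak is labelled as a promoter peak.

```python
from bisect import bisect_left

# Hypothetical toy data: (TSS position, gene name), sorted by position.
TSS = [(1_000, "geneA"), (20_000, "geneB"), (50_000, "geneC")]
# Hypothetical peak summit positions on the same chromosome.
PEAKS = [900, 4_500, 21_000, 48_000]


def nearest_gene(summit):
    """Return (gene, signed distance) for the TSS closest to the summit."""
    positions = [pos for pos, _ in TSS]
    i = bisect_left(positions, summit)
    # Only the TSS just before and just after the summit can be the nearest.
    candidates = TSS[max(i - 1, 0): i + 1]
    pos, gene = min(candidates, key=lambda t: abs(summit - t[0]))
    return gene, summit - pos


def annotate(peaks, window):
    """Assign every peak to its nearest gene; the window only sets the label."""
    upstream, downstream = window
    rows = []
    for summit in peaks:
        gene, dist = nearest_gene(summit)
        label = "Promoter" if upstream <= dist <= downstream else "Distal"
        rows.append((summit, gene, label))
    return rows


for window in [(-300, 300), (-3000, 3000), (-10000, 10000)]:
    rows = annotate(PEAKS, window)
    n_genes = len({gene for _, gene, _ in rows})
    n_promoter = sum(label == "Promoter" for _, _, label in rows)
    print(window, "genes:", n_genes, "promoter peaks:", n_promoter)
```

With this toy data the number of distinct genes stays at 3 for all three windows, while the number of peaks labelled as promoter peaks grows from 1 to 3 to 4. So the place to look for an effect of "tssRegion" is the annotation category of each peak, not the gene count.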


