Annotating regions with annotatePeak and UCSC refGene gives different answers in some cases
3 months ago
pt.taklifi ▴ 60

Hello everyone. I have a list of regions mapped to GRCh38 and I want to find the names of the genes that map to them. First I tried the annotatePeak function from the ChIPseeker package, which returns the Entrez ID of the nearest gene as well as an annotation (e.g., Promoter, Distal, Exon, ...), and then I converted the gene IDs to gene names using the getSYMBOL function from the annotate package.

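library(plyranges)                          # as_granges()
library(ChIPseeker)                         # annotatePeak()
library(TxDb.Hsapiens.UCSC.hg38.knownGene)  # hg38 gene models
library(annotate)                           # getSYMBOL()
library(org.Hs.eg.db)                       # Entrez ID to symbol mapping

G.ranges <- as_granges(ranges, seqnames = seqnames, start = start, end = end)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
annotated <- annotatePeak(G.ranges, TxDb = txdb, level = "gene", addFlankGeneInfo = TRUE)

a1 <- as.data.frame(annotated@anno)
a1$symbol <- getSYMBOL(annotated@anno$geneId, data = 'org.Hs.eg.db')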


I also tried this approach, which uses UCSC refGene.

The problem is that for some regions these two methods give me different gene names.

For example, for chr2:112541661_112542162 the first approach returns POLR1B, whereas the second method, using UCSC refGene, returns LOC105373562.
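
To make the kind of disagreement concrete, here is a minimal base R sketch of how two assignment rules can name different genes for the same region. It is only an illustration under stated assumptions: the coordinates and the names geneA and geneB are hypothetical and not taken from hg38, knownGene, or refGene. It assumes one rule picks the gene with the nearest TSS (in the spirit of the "nearest gene" that annotatePeak reports) and the other picks a gene whose body overlaps the region, which may or may not be what the refGene approach actually does.

```r
# Toy illustration only: coordinates and gene names are hypothetical.
genes <- data.frame(
  gene   = c("geneA", "geneB"),
  start  = c(1000, 5200),
  end    = c(6000, 5900),
  strand = c("+", "+"),
  stringsAsFactors = FALSE
)

# TSS is the start of a "+" strand gene and the end of a "-" strand gene.
genes$tss <- ifelse(genes$strand == "+", genes$start, genes$end)

# Hypothetical query region.
region_start <- 5000
region_end   <- 5100

# Distance from a single position to the region (0 if the position is inside).
dist_to_region <- function(pos, s, e) {
  ifelse(pos < s, s - pos, ifelse(pos > e, pos - e, 0))
}

# Rule 1 (assumed): assign the gene whose TSS is closest to the region.
tss_dist    <- dist_to_region(genes$tss, region_start, region_end)
nearest_tss <- genes$gene[which.min(tss_dist)]

# Rule 2 (assumed): assign the gene whose body overlaps the region.
overlaps    <- genes$start <= region_end & genes$end >= region_start
overlapping <- genes$gene[overlaps]

cat("nearest TSS rule:", nearest_tss, "\n")
cat("overlap rule:    ", overlapping, "\n")
```

In this toy case the region lies inside geneA but is much closer to the TSS of geneB, so the two rules disagree even though both are applied correctly. Separately, knownGene and refGene are different gene-model sets, so a gene present in one may be absent from the other.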

I was wondering if there is a problem with my code using the annotatePeak function?

