Annotatepeaks Function From Homer
11.7 years ago
Dataminer ★ 2.8k

Hi!

I am using the annotatePeaks.pl script from HOMER to annotate my peaks to the nearest RefSeq gene. If you have used the tool, you will notice it generates 10 columns. The columns I am interested in are distance to TSS, nearest promoter ID, nearest RefSeq ID, and gene name.

Problem: at times a peak is annotated with a nearest promoter ID and a distance to TSS, while the nearest RefSeq ID and gene name columns are blank.

Since I want to find the nearest gene to each peak, the nearest RefSeq ID and gene name are the columns I care about. If those two columns are blank while the distance to the nearest TSS is not, am I in trouble? Or is it completely normal for the nearest promoter ID column to hold an ID while the nearest RefSeq ID and gene name are blank?

I am completely confused.

Thank you. Any help would be appreciated.

genomics • 12k views

I haven't seen a case like that. Can you post an example file with 5-10 lines? I have seen cases where Nearest Ensembl, Gene Name, and Gene Alias are empty but the peak still has a distance to TSS and a Nearest PromoterID along with a Nearest Refseq, which in most cases are the same. Another possibility is that there is no annotation in the RefSeq database for the gene behind that PromoterID. Try looking at the region in the UCSC Genome Browser to get a clue.


Hi Sukhdeep! Here is an example output file

Chr     St          Stp         TSS      Promoter_ID   Nearest_RefSeq   gene_name
chr1    84743800    84744829    -248
chr2    120711360   120711771   -14111   NR_000034

In the first example only the distance to TSS is present and the rest of the columns are empty, while in the second example the distance to TSS and the promoter ID are present and the rest are empty.

I have used hg18.
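
To list every affected row instead of spotting them by eye, something like the following could help. It is only a rough sketch: it assumes the annotatePeaks.pl output is tab-delimited with a header line, and that the header text you pass in (for example "Nearest Refseq") matches the header in your file exactly. The function name `blank_rows` is made up for illustration.

```shell
#!/usr/bin/env bash
# Print every data row whose value in a named column is empty.
# The column is located by its header text, so no fixed column
# position is assumed. Assumes tab-delimited input with a header line.
# Usage: blank_rows <annotation file> <column header>
blank_rows() {
    awk -F'\t' -v col="$2" '
        NR == 1 {
            # Find the index of the requested column in the header line.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        # Rows that are empty (or cut short) at that column are printed whole.
        $idx == "" { print $0 }
    ' "$1"
}
```

For example, `blank_rows peaks_annotated.txt "Nearest Refseq"` (the file name is a placeholder) would print the peaks that have no RefSeq ID, which can then be checked one by one in the genome browser.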

11.7 years ago

The manual says:

By default, annotatePeaks.pl loads a file in the "/path-to-homer/data/genomes/<genome>/<genome>.tss" that contains the positions of RefSeq transcription start sites. It uses these positions to determine the closest TSS, reporting the distance (negative values mean upstream of the TSS, positive values mean downstream), and various annotation information linked to locus including alternative identifiers (unigene, entrez gene, ensembl, gene symbol etc.). This information is also used to link gene-specific information (see below) to a peak/region, such as gene expression.

This .tss file holds the RefSeq accession number and its position on the chromosome. In most cases, when a peak is matched to one of these coordinates, the RefSeq ID is cross-referenced against the gene aliases in another file to fill in the 10-column output. So it may be that the RefSeq annotation is present but is not in UCSC/Ensembl. I don't have hg18 with me, but try to grep NR_000034 in the genome .tss file and look up the same position in the browser to get the answer.

In the first case, a distance was computed from some TSS position, but there is no linked annotation. For a definitive answer, contact Chris Benner, who wrote the tool.
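
The grep lookup could be done roughly like this. It is a sketch, not something I have run against hg18: the function name `find_tss_entry` is made up, and the only thing it assumes about the .tss file is that it is plain text with the accession appearing as a separate field.

```shell
#!/usr/bin/env bash
# Print every line of a TSS file that contains the given accession
# as a whole word (-w), treated as a literal string (-F).
# Usage: find_tss_entry <tss file> <accession>
find_tss_entry() {
    grep -w -F -- "$2" "$1"
}
```

Going by the path pattern quoted from the manual, the call for hg18 would look like `find_tss_entry /path-to-homer/data/genomes/hg18/hg18.tss NR_000034`, with `/path-to-homer` replaced by the local install directory. If nothing is printed, the accession is not in that file.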

Cheers


Sukhdeep: Thank you. I have written to Chris Benner and am still waiting for his reply. I did use the UCSC Genome Browser to see whether there was a gene at that position (yes, a gene was present). Anyway, I will wait for Chris's reply and keep you all updated.

11.7 years ago

I agree with Sukhdeep, and I am adding this as a separate answer because I think it is the right one: the gene annotation is incomplete, so it is possible to have a promoter without a correspondingly annotated gene, or without one close enough to be included in the report.

8.6 years ago
bwshi2012 • 0

I ran into the same problem. I also sent Chris Benner a message but have not received a reply.

I use mm9, and the command I ran is

annotatePeaks.pl wt2exp3_peaks.bed data/genomes/mm9 >wt2exp3_peakannotate.xls

and it gives me a result like the following:

Does anyone know how to solve this problem?
