Annotate .bed file with gene names and exon
4.8 years ago
mariab • 0

I have a bed file with the following structure (1 to 6 indicate the row number of the bed):

   chr.number     start      stop          V4
1  chr1         4131635   4131815   rs3936238
2  chr1        11489587  11489767    rs877309
3  chr1        21652120  21652300    rs213028
4  chr1        25277819  25278166  rs11249206
5  chr1        27022864  27022985   NM_139135
6  chr1        27023022  27023312   NM_139135

I would like to annotate it with the following exon information in columns 5 and 6, preferably in R:

gene symbol
exon number

Example desired output:

chr.number  start     end       GEN_SYMBOL  exon
chr13       32972272  32972941  BRCA2       exon27

I need this for the entire BED file. If possible, I would also like the start and end positions of each gene and each exon.

I have tried biomaRt, but I do not know how to get all of those fields as output. Additionally, I only found out how to annotate with the Ensembl annotation, and not with the one I need...

Thank you in advance!

4.8 years ago
2nelly ▴ 310

Hi mariab,

For me, the easiest solution would be to use the bedtools intersect function.

You can intersect your BED file with a GTF file (clean it first by keeping only the exon coordinates) and get the two extra columns you want.
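
The "clean it first" step could be sketched with awk as below. This is only an illustration: it assumes the GTF stores quoted gene_name and exon_number attributes in column 9 (attribute names differ between annotation sources, so check your own file), and the file names demo.gtf and exons.bed as well as the three records are made up.

```shell
# Tiny made-up GTF so the example runs on its own (coordinates are illustrative).
printf 'chr1\tdemo\ttranscript\t100\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "GENEA";\n' > demo.gtf
printf 'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; exon_number "1"; gene_name "GENEA";\n' >> demo.gtf
printf 'chr1\tdemo\texon\t400\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; exon_number "2"; gene_name "GENEA";\n' >> demo.gtf

# Keep exon rows only and write them as BED.
# GTF coordinates are 1-based inclusive, BED is 0-based half-open, hence start - 1.
awk -F'\t' -v OFS='\t' '
function attr(key,    re, s) {
    # Pull the quoted value of one attribute out of column 9, or "NA" if absent.
    re = key " \"[^\"]+\""
    if (match($9, re)) {
        s = substr($9, RSTART, RLENGTH)   # e.g. gene_name "GENEA"
        gsub(/^[^"]+"|"$/, "", s)         # strip the key and both quotes
        return s
    }
    return "NA"
}
$3 == "exon" {
    print $1, $4 - 1, $5, attr("gene_name"), "exon" attr("exon_number")
}' demo.gtf > exons.bed

cat exons.bed
```

The resulting five-column file (chromosome, start, end, gene symbol, exon label) is the kind of annotation file that can then be intersected with the query BED file.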

In case you don't have a GTF file, you can obtain one (mouse example) using the code below:

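# Download the UCSC refGene table for mm10 into the current directory
wget http://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/refGene.txt.gz
gzip -d refGene.txt.gz
# Drop the first column (bin) so the table is in genePred format
cut -f 2- refGene.txt > mm10refGene.input
# "file" tells genePredToGtf to read a genePred file instead of a database table
genePredToGtf file mm10refGene.input mm10refGene.gtf
# Sort by chromosome, then start, then end
sort -V -k1,1 -k4,4 -k5,5 mm10refGene.gtf > mm10refGene
mv mm10refGene mm10refGene.gtf
rm mm10refGene.input refGene.txt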
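
For the intersect step itself, a minimal sketch might look like the following. It assumes the exon annotation has already been reduced to a five-column BED file (chromosome, start, end, gene symbol, exon label); the file names regions.bed, exons.bed and annotated.bed and all records are hypothetical. If bedtools is not installed, the sketch falls back to the same overlap test written in plain awk, which is only reasonable for small files.

```shell
# Made-up demo inputs: one query region and three annotated exons.
printf 'chr1\t150\t450\trs_demo\n' > regions.bed
printf 'chr1\t99\t200\tGENEA\texon1\nchr1\t399\t500\tGENEA\texon2\nchr1\t899\t1000\tGENEB\texon1\n' > exons.bed

if command -v bedtools >/dev/null 2>&1; then
    # -wa -wb: print each query region followed by the exon record it overlaps.
    # Columns 1-4 are the region, 8-9 are the gene symbol and exon label.
    bedtools intersect -a regions.bed -b exons.bed -wa -wb \
        | cut -f 1-4,8,9 > annotated.bed
else
    # Fallback: load the exons into memory, then test every region against them.
    # Two half-open intervals overlap when each one starts before the other ends.
    awk -F'\t' -v OFS='\t' '
        NR == FNR { n++; c[n] = $1; s[n] = $2; e[n] = $3; g[n] = $4; x[n] = $5; next }
        {
            for (i = 1; i <= n; i++)
                if ($1 == c[i] && $2 < e[i] && $3 > s[i])
                    print $1, $2, $3, $4, g[i], x[i]
        }
    ' exons.bed regions.bed > annotated.bed
fi

cat annotated.bed
```

A region that spans several exons is reported once per overlapping exon, so the output can have more rows than the input BED file.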
4.8 years ago
Lila M ★ 1.2k

Hope this helps.

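library(biomaRt)

# Connect to the Ensembl mouse gene dataset
mart <- useDataset("mmusculus_gene_ensembl", useMart("ensembl"))

# "yourfilter" and yourvalue are placeholders: replace them with a real
# filter name (see listFilters(mart)) and the values to query
annotated <- getBM(
  filters    = "yourfilter",
  attributes = c("chromosome_name", "exon_chrom_start", "exon_chrom_end",
                 "strand", "ensembl_gene_id", "ensembl_exon_id"),
  values     = yourvalue,
  mart       = mart
)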