In short, I have a list of SNPs that I would like to map to the closest gene within 1000 kb (1 Mb). I am using the biomaRt package in R/Bioconductor. I can successfully map my SNPs to genes, but only with biomaRt's default flanking region (I believe that is 100 kb). Here are my commands to get the genes.
Also, I am using the hg19 (GRCh37) build.
library(biomaRt)

# Mart used to map SNPs to Ensembl Gene IDs
grch37.snp = useMart(biomart="ENSEMBL_MART_SNP", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_snp")

# Mart used to map Ensembl Gene IDs to Gene name
grch37 = useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", path="/biomart/martservice", dataset="hsapiens_gene_ensembl")

snpList <- studyResults$SNP # List of 5000 SNPs
1. Mapping SNPs to Ensembl Gene IDs
table1 <- getBM(attributes = c("refsnp_id", "ensembl_gene_stable_id"), filters = "snp_filter", values = snpList, mart = grch37.snp)
2. Mapping Ensembl Gene IDs to Gene names
table2 <- getBM(attributes = c("ensembl_gene_id", "external_gene_name","external_gene_source","variation_name","start_position","end_position","description"), filters = "ensembl_gene_id", values = table1$ensembl_gene_stable_id, mart = grch37)
3. Merge both tables
results <- merge(table1, table2, by.x = "ensembl_gene_stable_id", by.y = "ensembl_gene_id", all.x = TRUE)
So here's my question: How do I specify the region I want? Right now, biomaRt is mapping the SNP to the gene within 100kb, but I want to map the SNP to the gene within 1000kb.
If I can't do this in biomaRt, I welcome alternative solutions!
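To make the goal concrete, here is a minimal base-R sketch of the result I am after, assuming I already had SNP positions and gene coordinates in two data frames. The data frames, their column names, the coordinates, and the helper nearest_gene() are all made up for illustration; none of this is a biomaRt call, and I do not know whether biomaRt can do the equivalent directly.

```r
# Toy data: IDs and coordinates are invented for illustration only.
snps <- data.frame(
  refsnp_id = c("rsA", "rsB", "rsC"),
  chr = c("1", "1", "2"),
  pos = c(1500000, 5000000, 1500000),
  stringsAsFactors = FALSE
)
genes <- data.frame(
  gene = c("GENE1", "GENE2", "GENE3"),
  chr = c("1", "1", "2"),
  start = c(1000000, 1900000, 2000000),
  end = c(1200000, 2100000, 2100000),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the closest gene on the same chromosome,
# or NA if no gene lies within `window` bp of the SNP.
nearest_gene <- function(chr, pos, genes, window = 1e6) {
  cand <- genes[genes$chr == chr, , drop = FALSE]
  if (nrow(cand) == 0) return(NA_character_)
  # Distance is 0 inside the gene, otherwise the gap to the nearer edge.
  d <- pmax(cand$start - pos, pos - cand$end, 0)
  i <- which.min(d)
  if (d[i] > window) return(NA_character_)
  cand$gene[i]
}

snps$nearest_gene <- mapply(
  nearest_gene, snps$chr, snps$pos,
  MoreArgs = list(genes = genes, window = 1e6),
  USE.NAMES = FALSE
)
print(snps)
```

With a 1 Mb window, a SNP keeps its nearest gene only if the gap is at most 1,000,000 bp; shrinking window to 1e5 would reproduce the narrower behaviour I think I am seeing now.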