Query genomic position based on rsID in BiomaRt
4.3 years ago
Yean ▴ 140

Hi all,

I queried genomic positions for a list of SNP rsIDs with biomaRt, and I ran into a problem with the chromosome name returned for some SNPs. This is the R code I used.

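# Load packages: data.table for fread(), magrittr (or dplyr) for %>%, biomaRt for the query
library(data.table)
library(magrittr)
library(biomaRt)

# SNP list (first column of the file holds the rsIDs)
sumstat <- fread("../../MGRS/GRP_shiny/effect_size/wGRS/DM.txt")[, 1] %>% as.list(.)

# Select mart
ensembl <- useEnsembl("snp", dataset = "hsapiens_snp", GRCh = "37")

# Get genomic position
SNPs <- getBM(attributes = c("refsnp_id", "chr_name", "chrom_start", "chrom_end"),
              filters = "snp_filter", values = sumstat, mart = ensembl, uniqueRows = TRUE)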

The problem is that some SNPs appear to map to more than one chromosome. For example, rs972283 is mapped to both chromosome 7 and HG1308_PATCH (I don't know what chromosome that is), and biomaRt usually reports the HG1308_PATCH entry for it. The same happens for other SNPs (e.g. rs10923931).

    refsnp_id     chr_name chrom_start chrom_end
49  rs7041847            9     4287466   4287466
50 rs17584499            9     8879118   8879118
51 rs10811661            9    22134094  22134094
52 rs13292136            9    81952128  81952128
53 rs10923931 HG1292_PATCH   120517959 120517959
54   rs972283 HG1308_PATCH   130445749 13044574

So my question is: is there any way to make biomaRt report only chromosomes 1-22 for the queried SNPs, instead of HG1292_PATCH and HG1308_PATCH?

Thanks

BiomaRt R SNPs Genomic position • 2.8k views
4.3 years ago
GenoMax 141k

You could exclude the patch entries afterwards, but keep in mind that the patches are valid sequence data. They provide alternate representations of loci found in the normal haploid assembly. See more here.
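As a rough sketch of the "exclude afterwards" route, something like the following base-R filter should work on the data frame returned by getBM(). The toy `SNPs` data frame below just re-types the six rows printed in the question (it is not a fresh query), and the names `autosomes`, `SNPs_primary` and `lost` are made up for illustration.

```r
# Toy stand-in for the getBM() result: only the six rows shown in the question.
SNPs <- data.frame(
  refsnp_id   = c("rs7041847", "rs17584499", "rs10811661",
                  "rs13292136", "rs10923931", "rs972283"),
  chr_name    = c("9", "9", "9", "9", "HG1292_PATCH", "HG1308_PATCH"),
  chrom_start = c(4287466, 8879118, 22134094,
                  81952128, 120517959, 130445749),
  stringsAsFactors = FALSE
)

# chr_name is a character column, so compare against "1".."22" as strings.
autosomes <- as.character(1:22)

# Keep only rows placed on chromosomes 1-22; patch rows are dropped.
SNPs_primary <- SNPs[SNPs$chr_name %in% autosomes, ]

# rsIDs that have no autosomal row left after filtering (worth checking by hand).
lost <- setdiff(SNPs$refsnp_id, SNPs_primary$refsnp_id)

print(SNPs_primary)
print(lost)
```

In the full query result, a SNP such as rs972283 that also has a chromosome 7 row would keep that row and would not show up in `lost`. Here it does, only because the toy table contains just the patch row for it. Add "X", "Y" or "MT" to `autosomes` if those are wanted as well.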
