How to use R biomaRt to get flanking sequence for a batch of SNPs
6.5 years ago
woofung • 0

Hello all,

I had a hard time trying to find a way to retrieve flanking sequence for a batch of SNPs. Here is the code I tried:

library(biomaRt)

ensembl.snp <- useEnsembl(biomart = "snp", dataset = "hsapiens_snp", GRCh = 38)
s <- c('rs429358', 'rs429337')  # rsIDs to look up
snp <- getBM(attributes = c('refsnp_id',
                            'minor_allele_freq',
                            'snp',
                            'ensembl_peptide_allele',
                            'allele',
                            'chr_name',
                            'mapweight'),
             filters = 'snp_filter',
             values = s,
             mart = ensembl.snp)
snp

It returns this:

  minor_allele_freq   snp refsnp_id ensembl_peptide_allele allele chr_name mapweight
1          0.281550 %T/G%  rs429337                           T/G        8         1
2          0.150559 %T/C%  rs429358                    C/R    T/C       19         1

But if I add "upstream_flank" to the attributes, like this:

snp <- getBM(attributes = c('refsnp_id',
                            'minor_allele_freq',
                            'snp',
                            'ensembl_peptide_allele',
                            'allele',
                            'chr_name',
                            'mapweight',
                            'upstream_flank'),
             filters = 'snp_filter',
             values = s,
             mart = ensembl.snp)

it always returns this error:

Error in getBM(attributes = c("refsnp_id", "minor_allele_freq", "snp",  : The query to the BioMart webservice returned an invalid result: the number of columns in the result table does not equal the number of attributes in the query. Please report this to the mailing list.

Does anyone know how to solve this? Thanks.


I think that in getBM(), sequences have to be treated as filters. Alternatively, you could use the getSequence() function. Also, did you do as the error message asks and report it to the biomaRt mailing list?


I don't know how to report it. Searching Google, I found a post from 4 years ago where people reported the same problem, but it had no solution.

6.5 years ago

I think using filters for flanking sequence with getBM goes like this (untested):

snp <- getBM(attributes = c('refsnp_id',
                            'minor_allele_freq',
                            'snp',
                            'ensembl_peptide_allele',
                            'allele',
                            'chr_name',
                            'mapweight'),
             filters = c('snp_filter', 'upstream_flank'),
             values = list(s, 1000),  # rsIDs, then flank length in bases
             mart = ensembl.snp)

The reason is that you need a way to specify the amount of sequence you need.
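
If a query does return flanks, they would need to be separated from the alleles. In the output shown in the question, the 'snp' column holds the alleles between % markers (%T/G%) with nothing on either side. Assuming the flanks, once requested, appear before and after those markers, a small base-R helper could split the string. The function name split_snp_sequence is made up for this sketch, and the input format is an assumption based only on that output:

```r
# Hypothetical helper: split strings of the assumed form
# "<upstream>%<alleles>%<downstream>" into three columns.
split_snp_sequence <- function(x) {
  # One capture group per part; each part may be empty.
  m <- regmatches(x, regexec("^([^%]*)%([^%]*)%([^%]*)$", x))
  rows <- lapply(m, function(g) {
    # g is the full match plus 3 groups when the pattern matched.
    if (length(g) == 4) g[2:4] else rep(NA_character_, 3)
  })
  out <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
  names(out) <- c("upstream", "alleles", "downstream")
  out
}

# Example with made-up flanks and the flank-less value from the question.
split_snp_sequence(c("ACGT%T/C%GGCA", "%T/G%"))
```

Strings without two % markers come back as a row of NA, so unexpected formats are easy to spot.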


It doesn't work, because "upstream_flank" appears only in the attributes list, not among the filters.


Actually, the question has already been answered on Biostars here. It seems I was on the right track but what's additionally required is checkFilters=FALSE.


I used listFilters() to look for filters in the ENSEMBL_MART_SNP database, so my case is different from the post you linked. There are no filters related to flanking sequence. I tried anyway, but as expected, R finds no upstream_flank filter.

I had also tried checkFilters=FALSE before; it didn't help.

Here is the code I tested.

snp.dataset <- useMart("ENSEMBL_MART_SNP", dataset = "hsapiens_snp")  # select dataset
d <- c('rs429358', 'rs429337')
snp <- getBM(attributes = c('refsnp_id',
                            'minor_allele_freq',
                            'snp',
                            'ensembl_peptide_allele',
                            'allele',
                            'chr_name'),
             filters = c('snp_filter', 'upstream_flank'),
             values = list(d, 20),  # one list element per filter
             mart = snp.dataset,
             checkFilters = FALSE)
snp
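
One variation that may be worth trying, pulling together the suggestions in this thread: keep the attribute list down to the ID and the 'snp' sequence column, pass flank lengths as filters, and skip the client-side filter check. This is an untested sketch, not a confirmed fix. The wrapper name fetch_snp_flanks, the 'downstream_flank' filter name, and the use of bmHeader = TRUE are all assumptions, and the function is only defined here, not run against the live service:

```r
# Hypothetical wrapper; requires the biomaRt package and network access.
fetch_snp_flanks <- function(rsids, flank = 20, mart = NULL) {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("The biomaRt package is required for this sketch.")
  }
  if (is.null(mart)) {
    mart <- biomaRt::useEnsembl(biomart = "snp", dataset = "hsapiens_snp")
  }
  biomaRt::getBM(
    # Only the ID and the sequence column, no other feature attributes.
    attributes = c("refsnp_id", "snp"),
    # Assumption: the service accepts both flank names as filters even
    # though listFilters() does not show them.
    filters = c("snp_filter", "upstream_flank", "downstream_flank"),
    values = list(rsids, flank, flank),  # one list element per filter
    mart = mart,
    checkFilters = FALSE,  # skip validation against listFilters()
    bmHeader = TRUE        # assumption: untested whether this is required
  )
}

# Usage (not run here):
# flanks <- fetch_snp_flanks(c("rs429358", "rs429337"), flank = 20)
```

If the service rejects the flank filters, the call should fail with a BioMart error rather than return empty flanks.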