Hello all,
I am having a hard time finding a way to retrieve flanking sequences for a batch of SNPs with biomaRt. Here is the code I tried:
library(biomaRt)
ensembl.snp <- useEnsembl(biomart = "snp", dataset = "hsapiens_snp", GRCh = 38)
# rsIDs to look up; the filter name goes in 'filters', so the vector needs no names
s <- c('rs429358', 'rs429337')
snp <- getBM(attributes = c('refsnp_id', 'minor_allele_freq', 'snp',
                            'ensembl_peptide_allele', 'allele', 'chr_name',
                            'mapweight'),
             filters = 'snp_filter',
             values = s,
             mart = ensembl.snp)
snp
And it returns this:
  minor_allele_freq   snp refsnp_id ensembl_peptide_allele allele chr_name mapweight
1          0.281550 %T/G%  rs429337                           T/G        8         1
2          0.150559 %T/C%  rs429358                    C/R    T/C       19         1
But if I add "upstream_flank" to the attributes, like this:
snp <- getBM(attributes = c('refsnp_id', 'minor_allele_freq', 'snp',
                            'ensembl_peptide_allele', 'allele', 'chr_name',
                            'mapweight', 'upstream_flank'),
             filters = 'snp_filter',
             values = s,
             mart = ensembl.snp)
it always returns this:
Error in getBM(attributes = c("refsnp_id", "minor_allele_freq", "snp", : The query to the BioMart webservice returned an invalid result: the number of columns in the result table does not equal the number of attributes in the query. Please report this to the mailing list.
Does anyone know how to solve this? Thanks.
I think that in getBM(), sequences have to be treated as filters. Alternatively, you could use the getSequence() function. Also, did you do as the error message asks and report it to the biomaRt mailing list?
I don't know how to report it. But by googling, I found a post where people reported the same problem 4 years ago, without a solution.
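For what it's worth, here is a minimal sketch of the "treat the flanks as filters" idea from the reply above. It rests on assumptions I have not checked against the current Ensembl release: that the snp mart accepts 'upstream_flank' and 'downstream_flank' as filters whose values are flank lengths in bases, that the flanks then come back inside the 'snp' attribute, and that these two filters are not in the listFilters() output, so getBM()'s filter check has to be switched off with checkFilters = FALSE. The helper names snp_flank_args and fetch_snp_flanks and the flank length of 20 are made up for illustration.

```r
# Sketch only: flank lengths are passed as filter values instead of being
# requested as attributes. Helper names and defaults are hypothetical.
snp_flank_args <- function(ids, flank = 20) {
  list(
    attributes   = c("refsnp_id", "snp"),
    filters      = c("snp_filter", "upstream_flank", "downstream_flank"),
    values       = list(ids, flank, flank),  # one list element per filter
    checkFilters = FALSE  # assumption: flank filters are not in listFilters()
  )
}

# Runs the query; needs the biomaRt package and network access to Ensembl.
fetch_snp_flanks <- function(ids, flank = 20) {
  mart <- biomaRt::useEnsembl(biomart = "snp", dataset = "hsapiens_snp")
  do.call(biomaRt::getBM, c(snp_flank_args(ids, flank), list(mart = mart)))
}

# Example call (not run here):
# fetch_snp_flanks(c("rs429358", "rs429337"))
```

If this works on your setup, the 'snp' column should hold each variant with its surrounding sequence. If the server rejects the flank filters, the assumption above does not hold for your Ensembl version.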