Question: Obtaining a list of SNPs using chromosome position with biomaRt
3.6 years ago, daddy.mcgarry wrote:

Hi,

I'm having difficulty using the getBM function. I'm trying to download the names of SNPs located within a certain region (of chromosome 15, for a particular transcript). I've tried several versions, but to no avail. I also noticed that the example in the biomaRt vignette did not work either (third example down).

snpmart <- useMart(host="www.ensembl.org", biomart="ENSEMBL_MART_SNP", dataset="hsapiens_snp")

snps <- getBM(attributes=c("refsnp_id","allele","chrom_start","chrom_strand"),
                        filters = c("chr_name","start","end"),
                        values = list(15,67065845,67195195), mart = snpmart)

snps <- getBM(attributes=c("refsnp_id","allele","chrom_start","chrom_strand"),
                        filters = c("chromosomal_region"),
                        values = list(1:67065845:67065900), mart = snpmart)

 getBM(c('refsnp_id','allele','chrom_start','chrom_strand'), 
          filters = c('chr_name','chrom_start','chrom_end'), 
          values = list(8,148350,148612), mart = snpmart)


"Error in getBM(c("refsnp_id", "allele", "chrom_start", "chrom_strand"),  : 
  Invalid filters(s): chrom_start, chrom_end 
Please use the function 'listFilters' to get valid filter names"
Tags: snp, biomart, R

I'm talking to our BioMart team and I'll get back to you when I know more.

Reply written 3.6 years ago by Emily_Ensembl

Many thanks for this information - sorry to hear it is a big job for the variants.

Reply written 3.6 years ago by daddy.mcgarry

3.6 years ago, Neilfws (Sydney, Australia) wrote:

The answer is right there in the error message:

filters <- listFilters(snpmart)
filters[grep("start|end|strand", filters$name),]
          name  description
2        start        Start
3          end          End
4   band_start   Band Start
5     band_end     Band End
6 marker_start Marker Start
7   marker_end   Marker End
9       strand       Strand

Looks like you want the filters start, end and strand, without the chrom_ prefix. Furthermore, there is no filter named allele, and the filter corresponding to refsnp_id is probably snp_filter.
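
One way to catch a bad filter name before calling getBM is to compare the requested names against what listFilters reports. The following is only a sketch: invalid_filters is a made-up helper, and the available vector is copied by hand from the output above, with chr_name added on the assumption that it is a valid filter in this mart.

```r
# Hypothetical helper: return the requested filter names that are not
# among the names the mart offers.
invalid_filters <- function(requested, available) {
  setdiff(requested, available)
}

# Filter names taken from the listFilters() output above.
# "chr_name" is an assumption; it does not appear in the grep result.
available <- c("chr_name", "start", "end", "band_start", "band_end",
               "marker_start", "marker_end", "strand")

bad <- invalid_filters(c("chr_name", "chrom_start", "chrom_end"), available)
print(bad)
# "chrom_start" "chrom_end"
```

With a live mart object, the second argument would be listFilters(snpmart)$name rather than a hand-written vector.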

You can also use the chromosomal_region filter e.g. 1:100:10000:-1 (chrom:start:end:strand).
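
A rough sketch of how such a region string might be built and passed to getBM. make_region and get_region_snps are hypothetical names introduced for illustration, and the attribute names are the ones from the question, so treat them as assumptions to check against listAttributes.

```r
# Build a "chrom:start:end[:strand]" string for the chromosomal_region filter.
# sprintf with %d avoids scientific notation (e.g. "1e+05") for round numbers.
make_region <- function(chrom, start, end, strand = NULL) {
  region <- sprintf("%s:%d:%d", as.character(chrom),
                    as.integer(start), as.integer(end))
  if (!is.null(strand)) {
    region <- sprintf("%s:%d", region, as.integer(strand))
  }
  region
}

region <- make_region(15, 67065845, 67195195)

# Hypothetical wrapper, defined but not called here: it needs the biomaRt
# package and a network connection. `mart` is a SNP mart such as snpmart.
get_region_snps <- function(mart, region) {
  biomaRt::getBM(
    attributes = c("refsnp_id", "allele", "chrom_start", "chrom_strand"),
    filters = "chromosomal_region",
    values = region,
    mart = mart
  )
}
```

The point of the helper is that the region is passed as a single character string; an unquoted 15:67065845:67195195 would be evaluated by R as a numeric sequence instead.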


3.6 years ago, Emily_Ensembl (EMBL-EBI) wrote:

We have a correction to your third query; it should be:

getBM(c('refsnp_id','allele','chrom_start','chrom_strand'), 
      filters = c('chr_name','start','end'), 
      values = list(8,148350,148612), mart = snpmart)

More generally, we have a problem with the variation mart, which is down to the sheer size of our variation database. This is not going to be a quick fix and will take a lot of work on our end.

You could instead consider using the filterVcf function from Bioconductor's VariantAnnotation package along with our VCF files: https://www.bioconductor.org/packages/devel/bioc/vignettes/VariantAnnotation/inst/doc/filterVcf.pdf

Here are the VCF files: ftp://ftp.ensembl.org/pub/current_variation/vcf/homo_sapiens/
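
As a rough illustration of the VCF route, the sketch below reads only the variants in a given region from a local VCF file. It uses readVcf with a ScanVcfParam range rather than filterVcf itself, and it assumes the file has been downloaded, is bgzip-compressed and has a tabix index next to it. read_region_variants is a hypothetical name, and the genome label is a placeholder.

```r
# Hypothetical helper: read the variants overlapping chrom:start-end from a
# bgzipped, tabix-indexed VCF. Requires the Bioconductor packages
# VariantAnnotation, GenomicRanges, IRanges and Rsamtools when called.
read_region_variants <- function(vcf_path, chrom, start, end,
                                 genome = "GRCh38") {
  region <- GenomicRanges::GRanges(
    seqnames = as.character(chrom),
    ranges = IRanges::IRanges(start = start, end = end)
  )
  # Restrict the read to the region of interest.
  param <- VariantAnnotation::ScanVcfParam(which = region)
  VariantAnnotation::readVcf(Rsamtools::TabixFile(vcf_path),
                             genome = genome, param = param)
}
```

On the returned object, names(SummarizedExperiment::rowRanges(vcf)) should give the variant identifiers, provided the ID column of the file is populated.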

Another option would be VCFtools: http://vcftools.sourceforge.net/.
