Retrieving All Available Frequency Data for a SNP Using Ensembl API Tools
10.4 years ago
Krisr ▴ 470

Hi,

I would like to retrieve all available population frequency data for ~5000 SNPs. I have used the Perl variation API to access Ensembl data for transcript features, but I can't figure out how to retrieve frequency information for a SNP; the variation API tutorial has little documentation on this. Does anyone know which routines provide this information, or whether any exist?

So, if I had a SNP rsXXXXXX, I would like to retrieve all population frequency data available, HapMap population, CEPH, etc.

Thanks!!

ensembl api variation snp frequency • 4.7k views

Thanks, I will try this!

I realize I'm supposed to post responses in the comments, but this wouldn't fit :)

I was trying to piece together some code from the various API Tutorials.

I have the following, but am not sure where or how to call the population method 'fetch_all_HapMap_Populations'. I want to call this on the rs# to retrieve the data, but how do I link the variation object to the population object?

#!/usr/bin/perl -w
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous'
);

my $va_adaptor = $registry->get_adaptor('human', 'variation', 'variation'); #get the different adaptors for the different objects needed
my $vf_adaptor = $registry->get_adaptor('human', 'variation', 'variationfeature');
my $pa = $registry->get_adaptor("human","variation","population");

my $pop = $pa->fetch_all_HapMap_Populations();

my $variation = $va_adaptor->fetch_by_name('rs123');

 if (defined $variation)
{
    foreach my $allele (@{$variation->get_all_Alleles()}) 
    {
       print $pop->$allele->fetch_all_HapMap_Populations() . "\t" . $allele->allele . "\t" . $allele->frequency . "\n";
    }
}

I ran the above MySQL query through Perl's DBI interface and wrote a loop that iterates over all SNPs and reports the frequencies. Thanks again!!!

10.4 years ago

I'm not an expert on the Ensembl MySQL schema, so use my answer with caution. The following SQL query seems to return the data you want:

mysql -u anonymous -h ensembldb.ensembl.org -P 5306 -D homo_sapiens_core_61_37f -A
select distinct
  V.name,
  S.handle,
  A.allele,
  A.frequency,
  M.name,
  F.allele_string
from (
  allele as A,
  variation as V,
  subsnp_handle as S,
  variation_feature as F
)
left join sample as M
  on (M.sample_id = A.sample_id)
where
  V.variation_id = A.variation_id and
  S.subsnp_id = A.subsnp_id and
  F.variation_id = V.variation_id and
  V.name = "rs3"
order by 1, 2, 5;

results:

ERROR 1146 (42S02): Table 'homo_sapiens_core_61_37f.allele' doesn't exist

The correct database name is homo_sapiens_variation_61_37, instead of homo_sapiens_core_61_37f.


Thanks Giovanni


I think the correct database is homo_sapiens_variation_61_37f, but I am not sure why it returns different values from the 1000 Genomes MAFs in dbSNP.
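For anyone who wants to run the query for many SNPs from Perl, as mentioned in the comments on the question, here is a minimal sketch. It assumes the DBI module with the DBD::mysql driver is installed, reuses the host and port from the mysql command in this answer, and uses the database name homo_sapiens_variation_61_37f suggested in the comments. The empty password and the helper names format_row and print_frequencies are my own assumptions for illustration, not part of any Ensembl tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Query from the answer above, with the rs name turned into a placeholder.
my $SQL = <<'SQL';
select distinct
  V.name, S.handle, A.allele, A.frequency, M.name, F.allele_string
from (
  allele as A,
  variation as V,
  subsnp_handle as S,
  variation_feature as F
)
left join sample as M on (M.sample_id = A.sample_id)
where
  V.variation_id = A.variation_id and
  S.subsnp_id = A.subsnp_id and
  F.variation_id = V.variation_id and
  V.name = ?
order by 1, 2, 5
SQL

# Turn one result row into a tab-separated line; NULL columns become "-".
sub format_row {
    my @cols = @_;
    return join("\t", map { defined($_) ? $_ : '-' } @cols) . "\n";
}

# Connect once, then run the prepared query for every rs ID.
sub print_frequencies {
    my @rs_ids = @_;
    require DBI;    # loaded lazily so format_row works without DBI installed
    my $dbh = DBI->connect(
        'DBI:mysql:database=homo_sapiens_variation_61_37f;'
          . 'host=ensembldb.ensembl.org;port=5306',
        'anonymous', '',              # assumption: anonymous user, empty password
        { RaiseError => 1 }
    );
    my $sth = $dbh->prepare($SQL);
    foreach my $rs (@rs_ids) {
        $sth->execute($rs);
        while (my @row = $sth->fetchrow_array) {
            print format_row(@row);
        }
    }
    $dbh->disconnect;
}

# Only touch the network when rs IDs are given on the command line.
print_frequencies(@ARGV) if @ARGV;
```

Called as, for example, `perl snp_freq_sql.pl rs3 rs123` (the script name is hypothetical), it should print one tab-separated line per row returned by the query. Preparing the statement once and binding each rs ID avoids rebuilding the SQL string inside the loop.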

10.4 years ago
Bert Overduin ★ 3.7k

Hi,

This script should do the trick:

#!/usr/local/bin/perl
use strict;
use warnings;

use Bio::EnsEMBL::Registry;

my $reg = 'Bio::EnsEMBL::Registry';

$reg->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous'
);

# get a variation adaptor
my $va = $reg->get_adaptor("human", "variation", "variation");

# fetch the variation by name and stop if it is not in the database
my $v = $va->fetch_by_name('rs123');
die "Variation rs123 not found\n" unless defined $v;

# get all the alleles
my @alleles = @{$v->get_all_Alleles()};

foreach my $allele(@alleles) {
    print
        $v->name, "\t",
        $allele->allele, "\t",
        (defined($allele->frequency) ? $allele->frequency : "-"), "\t",
        (defined($allele->population) ? $allele->population->name : "-"), "\n";
}

Hope this helps.

Note that questions about Ensembl and the Ensembl API can also be posted to the Ensembl helpdesk or the Ensembl developers mailing list (see the Ensembl website for how to subscribe to this list).
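Since the question mentions ~5000 SNPs, the script above can be wrapped in a loop. The following is only a sketch under a few assumptions: the rs IDs sit in a plain-text file with one ID per line, and the helper names read_rs_ids and print_frequencies are made up for this example. The Ensembl calls are the same ones used in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read rs IDs from a filehandle: one per line; blank lines and '#' comments are skipped.
sub read_rs_ids {
    my ($fh) = @_;
    my @ids;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace and the newline
        next if $line eq '' || $line =~ /^#/;
        push @ids, $line;
    }
    return @ids;
}

# Print allele, frequency and population for every rs ID in the list.
sub print_frequencies {
    my @rs_ids = @_;
    require Bio::EnsEMBL::Registry;    # loaded lazily; needs the Ensembl Perl API
    my $reg = 'Bio::EnsEMBL::Registry';
    $reg->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous'
    );
    my $va = $reg->get_adaptor('human', 'variation', 'variation');

    foreach my $rs (@rs_ids) {
        my $v = $va->fetch_by_name($rs);
        unless (defined $v) {
            warn "No variation found for $rs\n";
            next;
        }
        foreach my $allele (@{ $v->get_all_Alleles() }) {
            my $freq = $allele->frequency;
            my $pop  = $allele->population;
            print join("\t",
                $v->name,
                $allele->allele,
                defined($freq) ? $freq      : '-',
                defined($pop)  ? $pop->name : '-',
            ), "\n";
        }
    }
}

# Only connect when a file of rs IDs is given on the command line.
if (@ARGV) {
    open(my $fh, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
    my @ids = read_rs_ids($fh);
    close($fh);
    print_frequencies(@ids);
}
```

Run it as, for example, `perl snp_freq_api.pl rs_ids.txt` (both file names are hypothetical). Unknown rs IDs produce a warning and are skipped, so one bad ID does not stop the whole run.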
