Obtaining variant allele frequencies with the Ensembl API
3.0 years ago

Dear Community,

I hope you have been well these complicated days and weeks.

I am writing this post because I have a question about obtaining allele frequencies for a number of variants with the Ensembl API. I would like to obtain what is seen in the Ensembl browser, for example:

http://www.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=16:89919209-89920209;v=rs1805007;vdb=variation;vf=729218131

There we can see that variant rs1805007 has a lot of population-level information for alleles C and T.

However, when I try to get this information with the Ensembl API, I am unable to go from the Variation object to the Allele objects (the array of Allele objects is empty). This is my code:

#!/usr/bin/perl

use strict;
use warnings;
use Bio::EnsEMBL::Registry;

#connect to Ensembl
my $registry = "Bio::EnsEMBL::Registry";
$registry -> load_registry_from_db(
        -user => 'anonymous',
        -host => 'useastdb.ensembl.org',
        -species => 'homo_sapiens' );

#get adaptors
my $variation_adaptor = $registry -> get_adaptor( 'human', 'variation', 'variation' );
my $variation = $variation_adaptor -> fetch_by_name( "rs1805007" );

#print variant info
my @alleles = @{ $variation -> get_all_Alleles };
print( scalar( @alleles ) . "\n" );
foreach my $allele ( @alleles ){
        next unless ( defined $allele -> population );
        print( "\t\t" . $allele -> population -> name . "\t" .
                $allele -> allele . "\t" . $allele -> frequency . "\n" );
}

This program prints only "0", which is the size of the Allele array. Can anyone please advise how to access the information that can be seen in the browser, or point out what I am doing wrong in this code?

Thank you very much :)

Best wishes, Daniela


Which version of the API do you have?


Oh, gosh, sorry, I forgot to add that crucial bit of information! I am using the Ensembl API v103. Thanks so much, Emily!


Dear Emily,

I solved the problem! Thanks very much for your answer. For anyone who may stumble upon this: Ariel, a student in the lab, pointed me to this blog post:

https://www.ensembl.info/2015/06/18/1000-genomes-phase-3-frequencies-genotypes-and-ld-data/

There, I saw that I needed to install the Perl module Bio::DB::HTS::Tabix (https://metacpan.org/pod/Bio::DB::HTS::Tabix) and add its location to the $PERL5LIB environment variable. I already had the ensembl-io package installed.
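
A quick way to confirm that Perl can actually find the module once $PERL5LIB is set is a small check like the one below. This is only a sketch: the messages are my own wording, and it tests nothing beyond whether the module loads.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Try to load the Tabix bindings at run time; eval traps the failure
# so that a hint can be printed instead of a bare compile error.
if ( eval { require Bio::DB::HTS::Tabix; 1 } ) {
        # %INC records the file each loaded module came from
        print "Bio::DB::HTS::Tabix loaded from $INC{'Bio/DB/HTS/Tabix.pm'}\n";
}
else {
        die "Bio::DB::HTS::Tabix could not be loaded; check \$PERL5LIB\n$@";
}
```

If this dies, the path added to $PERL5LIB probably does not point at the directory that contains Bio/DB/HTS/Tabix.pm.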

After that, I just added the line

$variation_adaptor -> db -> use_vcf( 1 );

immediately after

my $variation_adaptor = $registry -> get_adaptor( 'human', 'variation', 'variation' );

And it worked! Now it prints all the variant allele frequencies for the C and T alleles in the gnomAD exomes, gnomAD genomes, 1000 Genomes Phase 3 and other populations.
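
For reference, here is roughly what the complete script looks like with the fix in place. It is the script from the question plus the use_vcf line. The two extra guards (a missing variant, an undefined frequency) are my own defensive additions rather than something the API requires, and the host name is simply the one used in the question.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Connect to the public Ensembl database server (same host as in the question)
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
        -user    => 'anonymous',
        -host    => 'useastdb.ensembl.org',
        -species => 'homo_sapiens',
);

# Get the variation adaptor and switch on VCF support; as far as I
# understand from the blog post, these population frequencies are read
# from VCF files, which is why the Allele array was empty without this line
my $variation_adaptor = $registry->get_adaptor( 'human', 'variation', 'variation' );
$variation_adaptor->db->use_vcf( 1 );

my $variation = $variation_adaptor->fetch_by_name( 'rs1805007' );
die "Variant rs1805007 not found\n" unless defined $variation;

# Print population name, allele and frequency for every Allele object
my @alleles = @{ $variation->get_all_Alleles };
print scalar( @alleles ), "\n";
foreach my $allele ( @alleles ) {
        next unless defined $allele->population;
        next unless defined $allele->frequency;    # assumption: skip alleles without a frequency
        print join( "\t", $allele->population->name,
                $allele->allele, $allele->frequency ), "\n";
}
```

It needs network access to the Ensembl servers and the Bio::DB::HTS::Tabix module mentioned above, so the first number printed should no longer be 0.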

Thank you very much!

Daniela


Ah, fantastic. I was just about to give you that answer!
