How to obtain the highest population MAF of a variation through the Ensembl Perl API
4.2 years ago

Hi community,

I am trying to retrieve the highest population MAF value for a variant in Ensembl using the Perl API. Although I can see this value in the web browser (see figure), I cannot retrieve it through the Perl API.

[Figure: Ensembl screenshot]

After some research about how this data is accessed from the web site, I have developed the following function, which receives a TranscriptVariation object and is part of a larger script:

sub get_highest_population_maf {
    my $tv = shift;
    my $transcript = $tv->transcript;
    my $vf = $tv->variation_feature;
    my $max_alleles = $vf->get_all_highest_frequency_minor_Alleles;
    my $highest_population_maf = undef;
    if($max_alleles && @$max_alleles) {
        $highest_population_maf = $max_alleles->[0]->frequency;
    } 
    return $highest_population_maf;
}

The idea is to obtain the highest population MAF value that is shown in the web portal. However, the call $vf->get_all_highest_frequency_minor_Alleles always returns an empty list, even though the value is visible on the website.

I am working with the goat species, and I initialize the database adaptors as follows:

# Registry configuration
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);
$registry->set_reconnect_when_lost();

my $taxonomic_code = "capra_hircus";
my $transcript_adaptor = $registry->get_adaptor( $taxonomic_code, 'core', 'transcript' );
my $slice_adaptor = $registry->get_adaptor( $taxonomic_code, 'Core', 'Slice' );
my $trv_adaptor = $registry->get_adaptor( $taxonomic_code, 'variation', 'transcriptvariation' );
my $vf_adaptor = $registry->get_adaptor( $taxonomic_code, 'variation', 'variationfeature' );

Does anybody see anything wrong with my approach?

Thanks in advance. Best wishes, Francisco Abad.

ensembl perl api variations • 950 views
4.1 years ago

Hi Ben, thanks for your answer. After setting the option you mentioned, it worked!

Best regards, Francisco Abad.

4.1 years ago
Ben_Ensembl ★ 2.4k

Hi Francisco,

The population frequency data is stored in VCF files. You need to set the use_vcf flag to 1 so that the API knows to consult the VCF files when looking up frequencies. You can set the flag through any of your variation adaptors. For example, for the variation feature adaptor:

$vf_adaptor->db->use_vcf(1);

There is more information in the following documentation: http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#ag

I would also consider using the REST API, which is a language-agnostic way to access the Ensembl data. You could use the GET variation/:species/:id endpoint with the optional pops=1 parameter: http://rest.ensembl.org/variation/capra_hircus/rs666529295?content-type=application/json;pops=1

This will return all allele frequencies from all populations, which you could then parse to find the highest frequency.
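As a rough illustration of that parsing step, here is a minimal Python sketch (Python is used only because the REST route is language agnostic). It makes several assumptions: that the decoded JSON has a "populations" list whose entries carry "population", "allele" and "frequency" keys; that the minor allele of a population is its second most frequent allele; and that the highest population MAF is the largest such value across populations. The function name and the sample data are hypothetical and are not real Ensembl output.

```python
from collections import defaultdict


def highest_population_maf(variation):
    """Return (population, allele, frequency) for the highest minor allele
    frequency found in any population, or None if none can be computed.

    `variation` is the decoded JSON of a variation REST response requested
    with pops=1 (assumed layout: a "populations" list of dicts).
    """
    # Group the per-allele frequencies by population name.
    by_population = defaultdict(list)
    for entry in variation.get("populations", []):
        freq = entry.get("frequency")
        if freq is None:
            continue  # skip entries that carry no frequency
        by_population[entry["population"]].append((float(freq), entry["allele"]))

    best = None
    for population, alleles in by_population.items():
        if len(alleles) < 2:
            continue  # only one allele reported: no minor allele here
        # Assumption: the minor allele is the second most frequent one.
        alleles.sort(reverse=True)
        freq, allele = alleles[1]
        if best is None or freq > best[2]:
            best = (population, allele, freq)
    return best


# Made-up sample in the assumed response layout (not real Ensembl data).
example = {
    "name": "rs_example",
    "populations": [
        {"population": "POP_A", "allele": "C", "frequency": 0.9},
        {"population": "POP_A", "allele": "T", "frequency": 0.1},
        {"population": "POP_B", "allele": "C", "frequency": 0.6},
        {"population": "POP_B", "allele": "T", "frequency": 0.4},
    ],
}

print(highest_population_maf(example))
```

With a real response, you would decode the JSON body returned by the URL above and pass the resulting dictionary to the function. It is worth checking the key names against an actual response first.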

Best wishes

Ben Ensembl Helpdesk
