How to obtain the highest population MAF of a variation through the Ensembl Perl API
21 months ago

Hi community,

I am trying to retrieve the highest population MAF value for a variant in Ensembl using the Perl API. Although I can see this value in the web browser (see figure), I am not able to retrieve it through the Perl API.

After some research into how the website accesses this data, I wrote the following function, which receives a TranscriptVariation object and is part of a larger script:

    sub get_highest_population_maf {
        my $tv = shift;
        my $transcript = $tv->transcript;
        my $vf = $tv->variation_feature;
        my $max_alleles = $vf->get_all_highest_frequency_minor_Alleles;
        my $highest_population_maf = undef;
        if ($max_alleles && @$max_alleles) {
            $highest_population_maf = $max_alleles->[0]->frequency;
        }
        return $highest_population_maf;
    }

The idea is to obtain the highest population MAF value shown in the web portal. However, the call $vf->get_all_highest_frequency_minor_Alleles always returns an empty list, even though the value is visible on the web.

I am working with the goat species, and I am initializing the database adaptors as follows:

    # Registry configuration
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );
    $registry->set_reconnect_when_lost();

    my $taxonomic_code = "capra_hircus";
    my $transcript_adaptor = $registry->get_adaptor( $taxonomic_code, 'core', 'transcript' );
    my $slice_adaptor = $registry->get_adaptor( $taxonomic_code, 'Core', 'Slice' );
    my $trv_adaptor = $registry->get_adaptor( $taxonomic_code, 'variation', 'transcriptvariation' );
    my $vf_adaptor = $registry->get_adaptor( $taxonomic_code, 'variation', 'variationfeature' );


Does anybody see anything wrong with my approach?

ensembl perl api variations • 384 views
21 months ago

Hi Ben, thanks for your answer. After setting the option you mentioned, it worked!

21 months ago
Ben_Ensembl ★ 1.8k

Hi Francisco,

The population frequency data is stored in VCF files. You need to set the use_vcf flag to 1 so that the API knows to consult the VCF files when looking up frequencies. You can set the flag through any of your variation adaptors. For example, for the variation feature adaptor: $vf_adaptor->db->use_vcf(1);
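As a rough sketch of how this fits into the function from the question, the flag can be set just before the frequency lookup. The helper name highest_maf_with_vcf is hypothetical and not part of the Ensembl API; the only API calls used are the ones already mentioned in this thread (db, use_vcf, get_all_highest_frequency_minor_Alleles and frequency). It assumes the VariationFeature object comes from the same variation database as the adaptor.

```perl
use strict;
use warnings;

# Hypothetical helper (not an Ensembl API function): switch on VCF lookups
# through the adaptor, then read the highest minor allele frequency.
# Assumes $vf belongs to the same variation database as $vf_adaptor.
sub highest_maf_with_vcf {
    my ($vf_adaptor, $vf) = @_;

    # Tell the API to consult the VCF files when looking up frequencies
    $vf_adaptor->db->use_vcf(1);

    my $max_alleles = $vf->get_all_highest_frequency_minor_Alleles;

    # Return undef when no frequency data is available for this variant
    return unless $max_alleles && @$max_alleles;
    return $max_alleles->[0]->frequency;
}
```

In the script from the question it is probably simplest to call $vf_adaptor->db->use_vcf(1) once, right after the adaptors are created, and leave get_highest_population_maf unchanged.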

I would also consider using the REST API, which is a language-agnostic way to access the Ensembl data. You could use the GET variation/:species/:id endpoint with the optional pops=1 parameter: http://rest.ensembl.org/variation/capra_hircus/rs666529295?content-type=application/json;pops=1

This will return all allele frequencies from all populations, which you could then parse to find the highest frequency.
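A minimal Perl sketch of that parsing step follows. It assumes the decoded JSON has a populations list whose entries carry population and frequency keys, so please check this against the actual response. The sub names are hypothetical. It also takes one possible reading of "highest population MAF": within each population the minor allele frequency is the second-highest allele frequency, and the result is the maximum of those values across populations.

```perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(decode_json);
use List::Util qw(max);

# Hypothetical helper: fetch and decode the REST response for a given URL.
sub fetch_variation_record {
    my ($url) = @_;
    my $response = HTTP::Tiny->new->get($url);
    die "Request failed: $response->{status} $response->{reason}\n"
        unless $response->{success};
    return decode_json($response->{content});
}

# Hypothetical helper: highest minor allele frequency across populations.
# Assumes $record->{populations} is a list of hashes with 'population'
# and 'frequency' keys (an assumption about the response layout).
sub highest_population_maf_from_rest {
    my ($record) = @_;

    # Collect the allele frequencies reported for each population
    my %freqs_by_pop;
    for my $entry (@{ $record->{populations} || [] }) {
        next unless defined $entry->{population} && defined $entry->{frequency};
        push @{ $freqs_by_pop{ $entry->{population} } }, $entry->{frequency};
    }

    # The minor allele frequency of a population is taken here as its
    # second-highest frequency; populations with one allele are skipped
    my @mafs;
    for my $pop (keys %freqs_by_pop) {
        my @sorted = sort { $b <=> $a } @{ $freqs_by_pop{$pop} };
        push @mafs, $sorted[1] if @sorted > 1;
    }

    return unless @mafs;
    return max(@mafs);
}
```

Passing the URL above to fetch_variation_record and the result to highest_population_maf_from_rest would give a single number, or undef if no population reports at least two alleles.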

Best wishes

Ben Ensembl Helpdesk