Question: Ensembl Perl API - How to get genotype frequencies
entheologist33 wrote, 3.3 years ago:

I need a way to get the genotype frequencies for different populations for various SNPs (so I need to be able to supply an RSID, then get a list of its genotype frequencies). I installed the Ensembl Perl API and found the tutorial on the variation module:

http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#ag

The problem is, in the tutorial it tells you how to get allele frequencies for SNPs, but doesn't mention any way to get genotype frequencies. For alleles, you can do this:

my $variation = $variation_adaptor->fetch_by_name('rs1333049');

my $alleles = $variation->get_all_Alleles();

foreach my $allele (@{$alleles}) {
  next unless (defined $allele->population);
  my $allele_string   = $allele->allele;
  my $frequency       = $allele->frequency // 'NA';  # // (Perl 5.10+) keeps a genuine frequency of 0
  my $population_name = $allele->population->name;
  printf("Allele %s has frequency: %s in population %s.\n", $allele_string, $frequency, $population_name);
}

but for info on genotypes, they only have this method:

my $variation = $variation_adaptor->fetch_by_name('rs1333049');

# OPTIONAL: uncomment this line to retrieve 1000 genomes phase 3 data also
# $variation_adaptor->db->use_vcf(1);

my $genotypes = $variation->get_all_SampleGenotypes();

foreach my $genotype (@{$genotypes}) {
  print "Sample ", $genotype->sample->name, " has genotype ", $genotype->genotype_string, "\n";
}

I was thinking I could maybe calculate the genotype frequencies from the allele frequencies, but to do that I'd need to know the allele counts, wouldn't I? I don't know whether the API lets you get allele counts. That is everything the tutorial shows; maybe the API has more features that the tutorial doesn't mention?
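For reference, a minimal sketch of the calculation being considered here, assuming a biallelic SNP in Hardy-Weinberg equilibrium. Under that assumption the expected genotype frequencies follow from the allele frequency alone (p^2, 2pq, q^2), so no allele counts are needed, but these are expectations and not the observed frequencies stored in the database. The function name hwe_genotype_frequencies and the example frequency 0.4 are made up for illustration.

```perl
use strict;
use warnings;

# Expected genotype frequencies for a biallelic SNP, ASSUMING
# Hardy-Weinberg equilibrium. $p is the frequency of one allele.
sub hwe_genotype_frequencies {
    my ($p) = @_;
    die "allele frequency must be between 0 and 1\n"
        unless defined $p && $p >= 0 && $p <= 1;

    my $q = 1 - $p;    # frequency of the other allele
    return (
        hom_first  => $p * $p,        # two copies of the first allele
        het        => 2 * $p * $q,    # one copy of each allele
        hom_second => $q * $q,        # two copies of the second allele
    );
}

# Hypothetical allele frequency, for illustration only.
my %expected = hwe_genotype_frequencies(0.4);
printf "%-10s %.4f\n", $_, $expected{$_} for sort keys %expected;
```

Real populations can deviate from these expectations, which is one reason to prefer observed genotype frequencies when they are available.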

EnsemblWill (United Kingdom) wrote, 3.3 years ago:

The method you're looking for is

$variation->get_all_PopulationGenotypes();

http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1Variation.html#a9e851dc76e253575f564d149b977c98e

http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PopulationGenotype.html

Make sure you've followed the steps for retrieving data from VCF files if you want to retrieve 1000 Genomes Phase 3 data:

http://www.ensembl.info/blog/2015/06/18/1000-genomes-phase-3-frequencies-genotypes-and-ld-data/
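As a rough sketch of how this could look in a full script, mirroring the allele loop from the question: the registry connection details (public server ensembldb.ensembl.org, user anonymous) are assumptions about the setup, and the frequency and population accessors are taken from the PopulationGenotype documentation linked above. It needs the Ensembl Perl API installed and network access to run.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: connect to the public Ensembl database server.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

my $variation_adaptor =
    $registry->get_adaptor('homo_sapiens', 'variation', 'variation');

# OPTIONAL: uncomment to retrieve 1000 Genomes Phase 3 data from VCF
# (requires the VCF setup described in the blog post above).
# $variation_adaptor->db->use_vcf(1);

my $variation = $variation_adaptor->fetch_by_name('rs1333049');

foreach my $pop_genotype (@{ $variation->get_all_PopulationGenotypes() }) {
    # Skip records that are not attached to a population.
    next unless defined $pop_genotype->population;

    my $frequency = $pop_genotype->frequency;
    printf(
        "Genotype %s has frequency: %s in population %s.\n",
        $pop_genotype->genotype_string,
        defined $frequency ? $frequency : 'NA',
        $pop_genotype->population->name,
    );
}
```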

entheologist33 wrote, 3.3 years ago:

Thank you! You have no idea how much hassle you have saved me. I was going to write a spider to scrape the data from dbSNP, which would have been so tedious.

Another quick question: is there a way to find out the available properties of the objects? For example, if I print $genotype->genotype, it just outputs an ARRAY(...) reference rather than the contents. I don't know what is inside it (it looks more like an object than an array to me). In PHP I'd just do print_r($genotype->genotype); I don't know how to do this in Perl.


The method links on this page:

http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1PopulationGenotype.html

tell you what to expect to be returned from each method call.

For example, the genotype method has the following info:

  Description: Getter for the genotype as an arrayref of alleles
  Returntype : arrayref of strings

You can also call genotype_string() to return a "|"-separated string of the composite alleles.
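For the print_r part of the question, a small sketch using the core Data::Dumper module, which is a common way to inspect a Perl data structure. The arrayref below is a hypothetical stand-in for the value returned by the genotype method, so the snippet runs without the Ensembl API.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the arrayref returned by ->genotype.
my $genotype = ['C', 'G'];

# ref() reports what kind of reference this is ('ARRAY' here).
print ref($genotype), "\n";

# Data::Dumper prints the nested contents, much like PHP's print_r.
print Dumper($genotype);

# Dereference with @{...} to use the alleles as an ordinary list.
print join('|', @{$genotype}), "\n";
```

Dumper also works on objects, although for API objects the documented methods are usually a safer guide than the internal hash keys.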

(Reply by EnsemblWill, written 3.2 years ago.)