I need a way to get the genotype frequencies for different populations for various SNPs (so I need to be able to supply an RSID, then get a list of its genotype frequencies). I installed the Ensembl Perl API and found the tutorial on the variation module: http://www.ensembl.org/info/docs/api/variation/variation_tutorial.html#ag
The problem is that the tutorial shows how to get allele frequencies for SNPs but doesn't mention any way to get genotype frequencies. For alleles, you can do this:
my $variation = $variation_adaptor->fetch_by_name('rs1333049');
my $alleles = $variation->get_all_Alleles();
foreach my $allele (@{$alleles}) {
    next unless (defined $allele->population);
    my $allele_string = $allele->allele;
    my $frequency = $allele->frequency || 'NA';
    my $population_name = $allele->population->name;
    printf("Allele %s has frequency: %s in population %s.\n", $allele_string, $frequency, $population_name);
}
But for genotypes, the tutorial only shows this method:
my $variation = $variation_adaptor->fetch_by_name('rs1333049');
# OPTIONAL: uncomment this line to retrieve 1000 genomes phase 3 data also
# $variation_adaptor->db->use_vcf(1);
my $genotypes = $variation->get_all_SampleGenotypes();
foreach my $genotype (@{$genotypes}) {
    print "Sample ", $genotype->sample->name, " has genotype ", $genotype->genotype_string, "\n";
}
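One way to turn per-sample genotypes like these into frequencies is simply to tally them. The following is a minimal sketch in plain Perl, not part of the Ensembl API: the sub name genotype_frequencies and the sample data are hypothetical, and the list of strings stands in for the values returned by $genotype->genotype_string. It assumes alleles are separated by | or / and treats C|G and G|C as the same genotype.

```perl
use strict;
use warnings;

# Hypothetical genotype strings, standing in for the values that
# $genotype->genotype_string returns for each sample.
my @genotype_strings = ('C|G', 'G|C', 'G|G', 'C|C', 'C|G', 'G|G', 'C|G', 'G|G');

# Tally observed genotype frequencies from a list of genotype strings.
# Returns a hash reference mapping genotype => frequency.
sub genotype_frequencies {
    my @strings = @_;
    my $total = scalar @strings;
    return {} unless $total;

    my %count;
    for my $string (@strings) {
        # Split into alleles and sort them so that allele order is ignored.
        my @alleles = split /[|\/]/, $string;
        my $key = join '|', sort @alleles;
        $count{$key}++;
    }

    my %freq = map { $_ => $count{$_} / $total } keys %count;
    return \%freq;
}

my $freq = genotype_frequencies(@genotype_strings);
for my $genotype (sort keys %{$freq}) {
    printf("Genotype %s has frequency: %.3f\n", $genotype, $freq->{$genotype});
}
```

To get per-population figures you would first group the samples by population and call the sub once per group; how samples map to populations is left out here.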
I was thinking maybe I could calculate the genotype frequencies from the allele frequencies, but to do that I'd need to know the allele counts, wouldn't I? I don't know if the API lets you get allele counts. That is all the tutorial shows. Maybe the API has more features that the tutorial doesn't mention?
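On deriving genotype frequencies from allele frequencies: if you are willing to assume Hardy-Weinberg equilibrium, the expected genotype frequencies for a biallelic SNP follow from the allele frequency alone (p^2, 2pq, q^2), with no allele counts needed. These are expected values under that assumption, not observed frequencies, so they can differ from what a population actually shows. A minimal sketch, with a hypothetical sub name and a made-up allele frequency:

```perl
use strict;
use warnings;

# Expected genotype frequencies for a biallelic SNP under the assumption of
# Hardy-Weinberg equilibrium. $p is the frequency of $allele1; the frequency
# of $allele2 is taken to be 1 - $p.
sub hwe_genotype_frequencies {
    my ($allele1, $allele2, $p) = @_;
    die "Allele frequency must be between 0 and 1\n" if $p < 0 || $p > 1;
    my $q = 1 - $p;
    return {
        "$allele1|$allele1" => $p * $p,      # homozygous for allele 1
        "$allele1|$allele2" => 2 * $p * $q,  # heterozygous
        "$allele2|$allele2" => $q * $q,      # homozygous for allele 2
    };
}

# Hypothetical allele frequency for allele C, for illustration only.
my $expected = hwe_genotype_frequencies('C', 'G', 0.4);
for my $genotype (sort keys %{$expected}) {
    printf("Expected frequency of %s: %.4f\n", $genotype, $expected->{$genotype});
}
```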
Thank you! My god, you have no idea how much hassle you have saved me. I was going to write a spider to rip the data from dbSNP, which would've been so tedious.
Another quick question: is there a way to find out the available properties of the objects? For example, if I print $genotype->genotype, it outputs Array. I don't know what the properties of this array are (well, keys if it were an array, but this looks like an object to me, not an array). In PHP I'd just do print_r($genotype->genotype); I don't know how to do this in Perl.
The method links on this page tell you what to expect to be returned from each method call. For example, the genotype method has the following info:
You can also call genotype_string() to return a |-separated string of the composite alleles.
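As for a Perl equivalent of PHP's print_r: the core Data::Dumper module prints the contents of any reference or object, and ref() tells you what kind of thing you are holding. A short sketch follows; the array reference here is a hypothetical stand-in, on the assumption that $genotype->genotype returns a reference to an array of allele strings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for the value returned by $genotype->genotype,
# assumed here to be a reference to an array of allele strings.
my $alleles = ['C', 'G'];

# Printing a reference directly only shows its type and memory address.
print "$alleles\n";

# ref() returns 'ARRAY', 'HASH', etc., or the class name for an object.
my $type = ref($alleles);
print "Type: $type\n";

# Data::Dumper shows the full contents, much like print_r in PHP.
$Data::Dumper::Sortkeys = 1;    # print hash keys in a stable order
my $dump = Dumper($alleles);
print $dump;

# Dereference the array to use its elements directly.
my $joined = join('|', @{$alleles});
print "$joined\n";
```

The same Dumper call works on API objects too, although the output for a large object can be long.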