Question: How to retrieve Ensembl phenotype data associated with Ensembl variations using Perl API
2.8 years ago by
darklings90 wrote:

I have retrieved all variants on a single chromosome using the slice adaptor and the variation feature adaptor, which gave me a file containing variant IDs and other basic information. It looks like this:

rs867361848     A/G     10010   10010   SNV     dbSNP
rs370048753     A/T     10014   10014   SNV     dbSNP
rs113469508     A/C/G   10015   10015   SNV     dbSNP

Next I want to get the phenotype data associated with these variants, so as a first step I tried to retrieve the phenotypes using only these IDs. Here is my code:

#!/usr/bin/env perl
use Bio::EnsEMBL::Registry;

# Connect to the Ensembl databases
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
  -host => '',
  -user => 'anonymous'
);

my $var_adaptor = $registry->get_adaptor('human', 'variation','variation');
my $pf_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'phenotypefeature');

my $filename1 = '/home/records/try.txt';
my $filename2 = '/home/records/result.txt';

open( FILE1, "$filename1" ) or die("Could not open file $filename1");
open( FILE2, ">$filename2") or die("could not open file $filename2");

# Look up each variant ID and print its phenotype features
while (my $aline = <FILE1>){
  my $var = $var_adaptor->fetch_by_name($aline);
  foreach my $pf (@{$pf_adaptor->fetch_all_by_Variation($var)}) {
    print FILE2 $pf->variation_names, "\t", $pf->phenotype->description, "\t", $pf->source_name,"\t";
    print FILE2 $pf->p_value,"\t" if (defined($pf->p_value));
    print FILE2 $pf->risk_allele, "\t" if (defined($pf->risk_allele));
    print FILE2 $pf->associated_gene,"\t" if (defined($pf->associated_gene));
    print FILE2 $pf->clinical_significance, "\n" if (defined($pf->clinical_significance));
  }
}

close FILE1;
close FILE2;

The "try.txt" file contains a list of variant ids, like:


And after this line,

my $var = $var_adaptor->fetch_by_name($aline);

if I write

print FILE2 $var->source_name,"\t",$var->variation_name,"\n";

I get the result:

dbSNP   rs745593600
dbSNP   rs201278642
dbSNP   rs370160198

That means I can access the variation data ($var), so the problem starts from this line:

foreach my $pf (@{$pf_adaptor->fetch_all_by_Variation($var)}) {....}

The results of $pf->variation_names, $pf->phenotype->description, etc. are empty. How can I solve this problem?

perl api ensembl • 801 views
2.8 years ago by
Ben_Ensembl1.5k wrote:

Hi Chilam,

The script works when I run it on my machine with my own sample data. I used a test file with these variants:

rs991426868     0 phenotype records
rs699           5 phenotype records
rs2238612       1 phenotype record

I suggest also outputting the variant name ($var->name) as a check on your script output:

print FILE2 $var->name, "\t";
print FILE2 $pf->variation_names, "\t", $pf->phenotype->description, "\t", $pf->source_name,"\t";

Are you sure that the variants you are looking at have phenotype information? The most likely reason the script returns no phenotype data is that there are no phenotype associations for the variants in your list.

Subset of my result file:

name        variation_names   description                                                         source_name
rs699                         Preeclampsia, susceptibility to                                     ClinVar
rs699                         Renal dysplasia                                                     ClinVar
rs699                         Susceptibility to progression to renal failure in IgA nephropathy   ClinVar
rs699       rs699             HYPERTENSION, ESSENTIAL, SUSCEPTIBILITY TO                          OMIM
rs699                         HYPERTENSION, ESSENTIAL, SUSCEPTIBILITY TO                          ClinVar
rs2238612   rs2238612         Height                                                              GIANT

Best wishes

Ben Ensembl Helpdesk
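
To make the check suggested above easier to read, here is a minimal sketch (not the code from either post) that prints one summary row per variant, so variants with zero phenotype features remain visible in the output. It assumes the same Ensembl Perl API calls used in the question. The helper names clean_id, format_row and report_phenotypes are hypothetical, and the database host is passed on the command line rather than hard-coded. The sketch also strips the trailing newline from each ID and always ends a row with a newline, which the original script does not do.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Remove the trailing newline and any surrounding whitespace from an input line.
sub clean_id {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;
    return $line;
}

# Join fields with tabs, writing undefined values as empty strings so every
# row has the same number of columns and always ends with a newline.
sub format_row {
    my @fields = @_;
    return join("\t", map { defined $_ ? $_ : '' } @fields) . "\n";
}

# Hypothetical driver: report how many phenotype features each variant has.
# $host is supplied by the caller; it is not hard-coded here.
sub report_phenotypes {
    my ($host, $infile) = @_;
    require Bio::EnsEMBL::Registry;    # loaded at run time, only when needed
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(-host => $host, -user => 'anonymous');

    my $var_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'variation');
    my $pf_adaptor  = $registry->get_adaptor('homo_sapiens', 'variation', 'phenotypefeature');

    open(my $in, '<', $infile) or die "Could not open $infile: $!";
    while (my $line = <$in>) {
        my $id = clean_id($line);
        next if $id eq '';
        my $var = $var_adaptor->fetch_by_name($id);
        if (!defined $var) {
            print format_row($id, 'variant not found');
            next;
        }
        my $pfs = $pf_adaptor->fetch_all_by_Variation($var);
        # One summary row per variant, then one row per phenotype feature.
        print format_row($id, scalar(@$pfs) . ' phenotype feature(s)');
        for my $pf (@$pfs) {
            print format_row($id, $pf->phenotype->description, $pf->source_name,
                             $pf->p_value, $pf->risk_allele,
                             $pf->associated_gene, $pf->clinical_significance);
        }
    }
    close $in;
    return;
}

# Run only when both arguments are given, e.g.: perl check_pf.pl HOST try.txt
report_phenotypes(@ARGV) if @ARGV == 2;
```

Run with a host name and the ID file as arguments; a variant that reports "0 phenotype feature(s)" was found in the database but has no phenotype associations, which matches the explanation given in the answer.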


Thank you so much Ben, I will know better next time

written 2.8 years ago by darklings90