How to retrieve Ensembl phenotype data associated with Ensembl variants using the Perl API
5.3 years ago
darklings ▴ 480

I have retrieved all variants on a single chromosome using the slice adaptor and the variation feature adaptor, which gave me a file containing variant ids and other basic information. It looks like this:

rs867361848     A/G     10010   10010   SNV     dbSNP
rs370048753     A/T     10014   10014   SNV     dbSNP
rs113469508     A/C/G   10015   10015   SNV     dbSNP

Next I want to get the phenotype data associated with these variants, so as a first step I tried using only the ids to retrieve their phenotypes. Here is my code:

#!/usr/bin/env perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Connect to the public Ensembl database server.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
  -host => 'ensembldb.ensembl.org',
  -user => 'anonymous'
);

my $var_adaptor = $registry->get_adaptor('human', 'variation','variation');
my $pf_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'phenotypefeature');

my $filename1 = '/home/records/try.txt';
my $filename2 = '/home/records/result.txt';

# Three-argument open keeps the mode separate from the file name.
open( FILE1, '<', $filename1 ) or die("Could not open file $filename1: $!");
open( FILE2, '>', $filename2 ) or die("Could not open file $filename2: $!");

while (my $aline = <FILE1>){
  chomp($aline);
  next unless length $aline;  # skip blank lines
  my $var = $var_adaptor->fetch_by_name($aline);
  next unless defined $var;   # skip ids that could not be fetched
  foreach my $pf (@{$pf_adaptor->fetch_all_by_Variation($var)}) {
    print FILE2 $pf->variation_names() // '', "\t", $pf->phenotype->description, "\t", $pf->source_name,"\t";
    print FILE2 $pf->p_value,"\t" if (defined($pf->p_value));
    print FILE2 $pf->risk_allele, "\t" if (defined($pf->risk_allele));
    print FILE2 $pf->associated_gene,"\t" if (defined($pf->associated_gene));
    print FILE2 $pf->clinical_significance if (defined($pf->clinical_significance));
    print FILE2 "\n";         # end every record, even without clinical significance
  }
}

close FILE1;
close FILE2;

The "try.txt" file contains a list of variant ids, like:

rs745593600
rs201278642
rs370160198
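If the id list is derived from the tab-delimited variant file shown earlier, the first column can be pulled out with a few lines of plain Perl. This is only a minimal sketch: the helper name first_column_ids is hypothetical, and it assumes the id is always the first whitespace-separated field on each line.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return the first whitespace-separated field of every non-empty line.
sub first_column_ids {
    my (@lines) = @_;
    my @ids;
    for my $line (@lines) {
        chomp $line;
        # split ' ' ignores leading whitespace and splits on any run of it
        my ($id) = split ' ', $line;
        push @ids, $id if defined $id && length $id;
    }
    return @ids;
}

# The three rows shown in the variant file above.
my @rows = (
    "rs867361848     A/G     10010   10010   SNV     dbSNP\n",
    "rs370048753     A/T     10014   10014   SNV     dbSNP\n",
    "rs113469508     A/C/G   10015   10015   SNV     dbSNP\n",
);

# Print one id per line, the layout used by try.txt.
print "$_\n" for first_column_ids(@rows);
```

In a real run the lines would come from reading the variant file rather than from a hard-coded list.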

After this line:

my $var = $var_adaptor->fetch_by_name($aline);

if I add

print FILE2 $var->source_name,"\t",$var->variation_name,"\n";

I get this result:

dbSNP   rs745593600
dbSNP   rs201278642
dbSNP   rs370160198

That means I can access the variation data ($var), so the problem starts from this line:

foreach my $pf (@{$pf_adaptor->fetch_all_by_Variation($var)}) {....}

The calls to $pf->variation_names, $pf->phenotype->description, etc. returned nothing. How can I solve this problem?

ensembl perl API • 1.3k views
5.3 years ago
Ben_Ensembl ★ 2.1k

Hi Chilam,

The script works when I run it on my machine with my own sample data. I used a test file with these variants:

rs991426868   0 phenotype records
rs699         5 phenotype records
rs2238612     1 phenotype record

I suggest that the variant name is also output ($var->name), as a check for your script output:

print FILE2 $var->name, "\t";
print FILE2 $pf->variation_names, "\t", $pf->phenotype->description, "\t", $pf->source_name,"\t";
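One thing to watch in the posted script is that the newline is only printed when clinical_significance is defined, so records without it run together on one line. A rough sketch of a helper that always writes the same number of columns and always ends the row is below. The name format_row and the '-' placeholder for missing values are my own choices for illustration, not part of the Ensembl API.

```perl
use strict;
use warnings;

# Join the fields with tabs, replacing undefined values with a placeholder
# so every row has the same number of columns and ends with a newline.
sub format_row {
    my (@fields) = @_;
    return join("\t", map { defined $_ ? $_ : '-' } @fields) . "\n";
}

# Example row with two missing values.
print format_row('rs699', undef, 'Renal dysplasia', 'ClinVar', undef);
```

Inside the loop, the separate print statements could then be replaced by a single print of format_row($var->name, $pf->variation_names, $pf->phenotype->description, $pf->source_name, $pf->p_value, $pf->risk_allele, $pf->associated_gene, $pf->clinical_significance).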

Are you sure that the variants you are looking at have phenotype information? The most likely cause of no phenotype data being returned by the script is that there are no phenotype associations for the variants in your list.
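To check this for each id, it may help to separate "variant not found" from "variant found but no phenotype records". The sketch below is only an illustration: phenotype_status is a hypothetical helper, and it assumes the two adaptors passed in behave like $var_adaptor and $pf_adaptor in the posted script (fetch_by_name and fetch_all_by_Variation are the calls already used there).

```perl
use strict;
use warnings;

# Report the phenotype status of one variant id.
# Returns 'not found' when fetch_by_name gives nothing back, otherwise
# the number of phenotype features attached to the variant.
sub phenotype_status {
    my ($var_adaptor, $pf_adaptor, $name) = @_;

    my $var = $var_adaptor->fetch_by_name($name);
    return 'not found' unless defined $var;

    # Fall back to an empty list if no array reference comes back.
    my $features = $pf_adaptor->fetch_all_by_Variation($var) || [];
    return scalar @{$features};
}
```

Calling phenotype_status($var_adaptor, $pf_adaptor, $aline) for every line of the input file and printing the id next to the result should show quickly whether the list simply contains variants with 0 records.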

Subset of my result file:

name       variation_names  description                                                        source_name
rs699                       Preeclampsia, susceptibility to                                    ClinVar
rs699                       Renal dysplasia                                                    ClinVar
rs699                       Susceptibility to progression to renal failure in IgA nephropathy  ClinVar
rs699      rs699            HYPERTENSION, ESSENTIAL, SUSCEPTIBILITY TO                         OMIM
rs699                       HYPERTENSION, ESSENTIAL, SUSCEPTIBILITY TO                         ClinVar
rs2238612  rs2238612        Height                                                             GIANT
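The record counts per variant can also be tallied from the result file itself. This is a small sketch, assuming the output is tab-delimited with one record per line and the variant name in the first column. The helper name count_by_variant and the sample lines are for illustration only.

```perl
use strict;
use warnings;

# Count result rows per variant, keyed on the first tab-separated column.
sub count_by_variant {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /\S/;      # ignore blank lines
        my ($name) = split /\t/, $line;
        $count{$name}++;
    }
    return \%count;
}

# Three sample rows in the layout of the result file above.
my $counts = count_by_variant(
    "rs699\t\tRenal dysplasia\tClinVar\n",
    "rs699\trs699\tHYPERTENSION, ESSENTIAL, SUSCEPTIBILITY TO\tOMIM\n",
    "rs2238612\trs2238612\tHeight\tGIANT\n",
);

# Print each variant with its number of rows.
print "$_\t$counts->{$_}\n" for sort keys %{$counts};
```

Any id from the input list that is missing from this tally produced no phenotype rows at all.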

Best wishes,

Ben
Ensembl Helpdesk

Thank you so much Ben, I will know better next time.
