Question: Ensembl API for retrieving human gene families
David.shaw (European Union) wrote, 3.1 years ago:

I am trying to retrieve the paralogues of a given input gene, together with the other members of its gene family. For example, if my input is TMEM110, I would like all the TMEM genes that are paralogues and members of its gene family (e.g. TMEM*).

Currently, the script below returns the gene family members for all species rather than just human. When I change 'Multi' to 'Human' the script breaks. It also outputs protein IDs, but I would like it to return Ensembl gene IDs like the input (ENSG00000139618).

Any help would be appreciated!

    use strict;
    use warnings;

    use Bio::EnsEMBL::Registry;

    ## Load the registry automatically
    my $reg = "Bio::EnsEMBL::Registry";
    $reg->load_registry_from_url('mysql://anonymous@ensembldb.ensembl.org');


    ## Get the compara genemember adaptor
    my $gene_member_adaptor = $reg->get_adaptor("Multi", "compara", "GeneMember");

    ## Get the compara family adaptor
    my $family_adaptor = $reg->get_adaptor("Multi", "compara", "Family");


    ## Get the compara member
    my $gene_member = $gene_member_adaptor->fetch_by_source_stable_id("ENSEMBLGENE", "ENSG00000139618");

    ## Get all the families
    my $all_families = $family_adaptor->fetch_all_by_Member($gene_member);

    ## For each family
    foreach my $this_family (@{$all_families}) {
      print $this_family->description(), " (description score = ", $this_family->description_score(), ")\n";

      ## print the members in this family
      my $all_members = $this_family->get_all_Members();
      foreach my $this_member (@{$all_members}) {
        print $this_member->source_name(), " ", $this_member->stable_id(), " (", $this_member->taxon()->name(), ")\n";
      }
      print "\n";
    }
Emily_Ensembl (EMBL-EBI) wrote, 3.1 years ago:

The gene family adaptor gets all members of the family and doesn't discriminate on species. You want to use the homology adaptor instead, then use the method_link_species_set adaptor to get paralogues. There's more info in this section of the online course.
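
As a rough sketch of the homology route (not tested against a specific Ensembl release): the script below asks the Homology adaptor for the paralogues of one gene and prints Ensembl gene IDs rather than protein IDs. It assumes that `fetch_all_by_Member` accepts a `-METHOD_LINK_TYPE` filter and that homology members expose `gene_member()`, which is how recent releases of the Compara API behave as far as I know; older releases used differently named methods, so check the documentation for your version. The helper name `paralogue_gene_ids` and the command-line handling are made up for this example.

```perl
use strict;
use warnings;

## Collect the Ensembl gene IDs found in a list of Compara homology objects,
## skipping the query gene itself (each homology pairs the query with one
## other gene). Returns a sorted, de-duplicated list.
sub paralogue_gene_ids {
    my ($homologies, $query_id) = @_;
    my %seen;
    foreach my $homology (@{$homologies}) {
        foreach my $member (@{ $homology->get_all_Members() }) {
            ## Members are protein-level; step up to the gene for an ENSG ID
            my $gene_id = $member->gene_member()->stable_id();
            $seen{$gene_id} = 1 unless $gene_id eq $query_id;
        }
    }
    my @ids = sort keys %seen;
    return @ids;
}

sub main {
    my ($query_id) = @_;

    ## Loaded at run time so the helper above can be reused without the API
    require Bio::EnsEMBL::Registry;
    my $reg = "Bio::EnsEMBL::Registry";
    $reg->load_registry_from_url('mysql://anonymous@ensembldb.ensembl.org');

    my $gene_member_adaptor = $reg->get_adaptor("Multi", "compara", "GeneMember");
    my $homology_adaptor    = $reg->get_adaptor("Multi", "compara", "Homology");

    ## Same call as in the question; newer API releases may only offer
    ## fetch_by_stable_id($query_id) instead (ASSUMPTION, check your version)
    my $gene_member = $gene_member_adaptor->fetch_by_source_stable_id("ENSEMBLGENE", $query_id);
    die "No compara gene member found for $query_id\n" unless defined $gene_member;

    ## ASSUMPTION: the -METHOD_LINK_TYPE option is available in your release
    my $homologies = $homology_adaptor->fetch_all_by_Member(
        $gene_member,
        -METHOD_LINK_TYPE => 'ENSEMBL_PARALOGUES',
    );

    print "$_\n" foreach paralogue_gene_ids($homologies, $query_id);
}

## Run only when called as a script with a gene ID argument
main(@ARGV) if !caller && @ARGV;
```

Saved under a name of your choice (say `paralogues.pl`), it would be run as `perl paralogues.pl ENSG00000139618`. In current releases paralogues should be within-species, so for a human query the results should all be human genes; if in doubt, check the taxon of each member the same way the family script does.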


Is there a way of getting similarly named genes? For example, TMEM110 has 1 paralogue, but there are many TMEM genes. Whilst I can use a regular expression for this example, in my actual application I don't want to write a separate regular expression for every gene, e.g.:

COL1A1 -> COL*

KLF11 -> KLF*

and so on, depending on what the user inputs into the script (which will be many genes, one after another).

Reply written 3.1 years ago by David.shaw
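
One generic way to avoid a per-gene regular expression might be to derive the "root" of each symbol automatically. The sketch below is only a naming heuristic, assuming the root is the leading run of letters (so COL1A1 gives COL and KLF11 gives KLF); the function names `symbol_root` and `similarly_named` are made up for this example, and the candidate list is assumed to come from wherever you already hold your human gene symbols. Bear in mind that a shared prefix does not imply homology: TMEM, for instance, is a naming convention for predicted transmembrane proteins rather than a single evolutionary family.

```perl
use strict;
use warnings;

## Heuristic: the symbol "root" is the leading run of letters, upper-cased.
## Returns undef for symbols that do not start with a letter.
sub symbol_root {
    my ($symbol) = @_;
    return $symbol =~ /^([A-Za-z]+)/ ? uc($1) : undef;
}

## From a list of candidate symbols, keep those whose root matches the
## root of the query symbol, excluding the query itself.
sub similarly_named {
    my ($query, @candidates) = @_;
    my $root = symbol_root($query);
    return () unless defined $root;
    return grep {
        my $candidate_root = symbol_root($_);
        defined $candidate_root
            && $candidate_root eq $root
            && uc($_) ne uc($query)
    } @candidates;
}
```

For example, `similarly_named('COL1A1', qw(COL1A2 COLEC10 KLF11 COL4A1))` keeps COL1A2 and COL4A1 but drops COLEC10, because its root is COLEC rather than COL. Symbols such as C1orf112 reduce to a one-letter root, so the heuristic needs a sanity check before you rely on it.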