Question: Retrieve protein domain coordinates given a protein/transcript ID
Lalla (Germany) wrote, 3.4 years ago:

Hi!

I have a list of Ensembl transcript and protein IDs and I want to find the coordinates of the protein domains of these proteins. Is there a tool that returns this information given the protein or transcript IDs as input? My ultimate goal is to find out whether the exons I am studying are part of a domain. I was planning to do that by transforming the exon genomic coordinates into protein coordinates and then using findOverlaps from GenomicRanges (Bioconductor) between the exon protein coordinates and the domain protein coordinates. This is probably a stupid question, but I searched and couldn't find a solution.

I know that I can see the domain coordinates in BioMart by looking at each protein in the browser, but is there a way to just download this information for an entire list of protein IDs? I think this is possible with the Perl API, but I find the relevant documentation really difficult to understand, so I would like to avoid using the Perl API.

I would also like to find out whether exons that are not part of a protein domain are associated with regulatory functions, using basically the same approach described above.

P.S. I'd prefer to avoid using UCSC

Thanks in advance!

 

 

Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote, 8 months ago:

The Perl API is the way to go in this case. See the documentation for the ProteinFeature object (Bio::EnsEMBL::ProteinFeature).

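 use strict;
 use warnings;
 use Bio::EnsEMBL::Registry;

 my $registry = "Bio::EnsEMBL::Registry";
 $registry->load_all("$ENV{'HOME'}/.ensembl_init");
 my $dba = $registry->get_DBAdaptor("Homo sapiens", "core");
 # Get protein from Ensembl; the protein stable ID (ENSP...) is read from the command line
 my $EnsprotID = shift @ARGV or die "Usage: $0 <Ensembl protein ID>\n";
 my $translation_adaptor = $dba->get_TranslationAdaptor();
 my $Ensprot = $translation_adaptor->fetch_by_stable_id($EnsprotID)
     or die "No translation found for $EnsprotID\n";
 # Get Pfam domains on the protein
 my @domains = @{$Ensprot->get_all_ProteinFeatures('pfam')};
 # Protein features are in protein (amino acid) coordinates and are not attached
 # to a slice, so map them to the genome through the transcript mapper
 my $mapper = $Ensprot->transcript()->get_TranscriptMapper();
 foreach my $domain (@domains) {
    my $id = $domain->display_id();
    my $start = $domain->start();   # protein coordinates
    my $end = $domain->end();
    # A domain spanning several exons maps to several genomic blocks
    foreach my $coord ($mapper->pep2genomic($start, $end)) {
       next unless $coord->isa('Bio::EnsEMBL::Mapper::Coordinate');   # skip gaps
       print join("\t", $EnsprotID, $id, $start, $end,
                  $coord->start(), $coord->end(), $coord->strand()), "\n";
    }
 }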
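
Once domain coordinates are available, the overlap test described in the question (is an exon part of a domain?) can be done in plain Perl without any Ensembl calls. The following is only a minimal sketch under stated assumptions: the helper names interval_overlap and exon_domain_hits, the exon and domain IDs, and all coordinates are hypothetical illustration data, and both exons and domains are assumed to already be in the same 1-based, inclusive protein coordinates.

```perl
use strict;
use warnings;

# Return the overlapping part of two closed intervals [s1,e1] and [s2,e2],
# or an empty list if they do not overlap (1-based, inclusive coordinates).
sub interval_overlap {
    my ($s1, $e1, $s2, $e2) = @_;
    my $start = $s1 > $s2 ? $s1 : $s2;
    my $end   = $e1 < $e2 ? $e1 : $e2;
    return $start <= $end ? ($start, $end) : ();
}

# Hypothetical example data, in protein (amino acid) coordinates.
my @domains = (
    { id => 'domain_A', start => 10,  end => 120 },
    { id => 'domain_B', start => 200, end => 310 },
);
my @exons = (
    { id => 'exon_1', start => 1,   end => 40  },
    { id => 'exon_2', start => 121, end => 199 },
);

# For each exon, collect the IDs of the domains it overlaps.
sub exon_domain_hits {
    my ($exons, $domains) = @_;
    my %hits;
    foreach my $exon (@$exons) {
        $hits{ $exon->{id} } = [];
        foreach my $domain (@$domains) {
            my @ov = interval_overlap($exon->{start}, $exon->{end},
                                      $domain->{start}, $domain->{end});
            push @{ $hits{ $exon->{id} } }, $domain->{id} if @ov;
        }
    }
    return \%hits;
}

my $hits = exon_domain_hits(\@exons, \@domains);
foreach my $exon_id (sort keys %$hits) {
    my @ids = @{ $hits->{$exon_id} };
    print join("\t", $exon_id, @ids ? join(",", @ids) : "none"), "\n";
}
```

Exons with no hit are the candidates for the follow-up question about regulatory functions. For long lists, sorting the intervals first or using findOverlaps in R would scale better than this all-against-all loop.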