Question: Retrieve protein domain coordinates given a protein/transcript ID
5.6 years ago, Lalla wrote:


I have a list of Ensembl transcript and protein IDs and I want to find the coordinates of the protein domains of these proteins. Is there a tool that gives me this information with the protein or transcript IDs as input? My ultimate goal is to find out whether the exons I am studying are part of a domain. I was planning to do that by transforming the exon genomic coordinates into protein coordinates and then using findOverlaps (GenomicRanges, Bioconductor) between the exon protein coordinates and the domain protein coordinates. This is probably a stupid question, but I searched and couldn't find a solution.

I know that I can see the domain coordinates in BioMart by looking at each protein in the browser, but is there a way to just download this information for an entire list of protein IDs? I think this is possible with the Perl API, but I find the relevant documentation really difficult to understand, so I would like to avoid using the Perl API.

I would also like to find out whether exons that are not part of a protein domain are associated with regulatory functions, using basically the same approach described above.

P.S. I'd prefer to avoid using UCSC

Thanks in advance!



2.9 years ago, Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote:

The Perl API is the way to go in this case. See the documentation for the Bio::EnsEMBL::ProteinFeature object.

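 use strict;
 use warnings;
 use Bio::EnsEMBL::Registry;

 my $EnsprotID = shift @ARGV or die "Usage: $0 <Ensembl protein ID>\n";
 my $registry = "Bio::EnsEMBL::Registry";
 # Connect to the public Ensembl database server
 $registry->load_registry_from_db(-host => 'ensembldb.ensembl.org', -user => 'anonymous');
 my $dba = $registry->get_DBAdaptor("Homo sapiens", "core");
 # Get protein from Ensembl
 my $translation_adaptor = $dba->get_TranslationAdaptor();
 my $Ensprot = $translation_adaptor->fetch_by_stable_id($EnsprotID)
     or die "No translation found for $EnsprotID\n";
 # Get PFAM domains on the protein
 my @domains = @{$Ensprot->get_all_ProteinFeatures('pfam')};
 # ProteinFeature start/end are positions on the peptide (amino acids), not on the chromosome
 foreach my $domain (@domains) {
     my $id = $domain->display_id();   # name of the matching domain
     my $start = $domain->start();
     my $end = $domain->end();
     print join("\t", $EnsprotID, $id, $start, $end), "\n";
 }
 # For genomic coordinates, map the peptide positions through the transcript, e.g.
 # $Ensprot->transcript->get_TranscriptMapper->pep2genomic($start, $end)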
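
Once you have domain coordinates on the protein, the overlap test described in the question needs no extra library. Below is a minimal sketch in plain Perl, assuming the exon has already been expressed as a 1-based, inclusive interval on the coding sequence (CDS). The function names, the hash layout for domains and the example values are all made up for illustration and are not part of the Ensembl API.

```perl
use strict;
use warnings;

# Convert a 1-based, inclusive CDS (coding nucleotide) interval into the
# 1-based, inclusive amino-acid interval it touches.
sub cds_to_protein {
    my ($cds_start, $cds_end) = @_;
    return (int(($cds_start + 2) / 3), int(($cds_end + 2) / 3));
}

# Two closed intervals overlap if each one starts before the other ends.
sub overlaps {
    my ($s1, $e1, $s2, $e2) = @_;
    return $s1 <= $e2 && $s2 <= $e1;
}

# Return the IDs of all domains overlapping a protein interval.
# Each domain is a hash ref with keys id, start, end (hypothetical layout).
sub domains_hit {
    my ($start, $end, @domains) = @_;
    return map  { $_->{id} }
           grep { overlaps($start, $end, $_->{start}, $_->{end}) } @domains;
}

# Made-up example values, for illustration only
my @domains = (
    { id => 'domain_A', start => 10,  end => 60  },
    { id => 'domain_B', start => 120, end => 200 },
);
my ($aa_start, $aa_end) = cds_to_protein(151, 300);   # coding part of one exon
print "Exon covers residues $aa_start-$aa_end\n";
print "Overlapping domains: ", join(", ", domains_hit($aa_start, $aa_end, @domains)), "\n";
```

In practice the domain list would be filled from the start and end values of each ProteinFeature, and exons lying entirely in UTRs would have to be skipped before the conversion, since they have no CDS coordinates.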