Selecting One Ensembl Transcript Id For Each Ensembl Gene Id
2
3
9.9 years ago
jackuser1979 ▴ 890

I have a transcript assembly that I tried to annotate against human Ensembl cDNA using tblastx. The top BLAST hits are Ensembl transcript IDs, which I converted to Ensembl gene IDs with the BioMart tool. Now many transcripts map to the same gene. If I have to keep one transcript per gene, what criteria should I use? Transcript length, or something else?

For example:

Ensembl Gene ID     Ensembl Transcript ID
ENSG00000139618     ENST00000380152
ENSG00000139618     ENST00000530893
ENSG00000139618     ENST00000528762
ENSG00000139618     ENST00000470094
ENSG00000139618     ENST00000533776
ENSG00000139618     ENST00000544455


In the above example, which Ensembl transcript ID should I choose for the gene ENSG00000139618?

ensembl conversion • 5.4k views
3

The answer to your question lies in what you need to accomplish. Think of it this way: if you selected one at random, would that be acceptable? If not, why not? The answer gives you the rule by which to select your transcript.

2

Why not simply keep the best hit from BLAST? For each sequence you can assign only the best Ensembl transcript ID.
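A minimal sketch of that idea, assuming the tblastx results were saved in the standard 12-column tabular format (`-outfmt 6`), where column 1 is the query, column 2 the subject, and column 12 the bit score. The function name `best_hits` and the example rows are made up for illustration; they are not real BLAST output.

```python
import csv
import io

def best_hits(lines):
    """Keep, for each query, the subject with the highest bit score.

    Assumes BLAST tabular output (-outfmt 6): tab-separated, with the
    query ID in column 1, subject ID in column 2, bit score in column 12.
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        # Strict ">" means the first hit seen wins on ties.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best

# Hypothetical rows, invented for illustration only.
example = io.StringIO(
    "contig1\tENST00000530893\t91.0\t120\t10\t0\t1\t360\t1\t360\t1e-50\t210\n"
    "contig1\tENST00000380152\t98.5\t300\t4\t0\t1\t900\t1\t900\t1e-150\t540\n"
    "contig2\tENST00000470094\t88.0\t60\t7\t0\t1\t180\t1\t180\t1e-20\t95\n"
)
hits = best_hits(example)
for query, (subject, score) in hits.items():
    print(query, subject, score)
```

Note that this gives one transcript per assembled sequence, not one per gene: two contigs can still hit different transcripts of the same gene.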

0

Thanks for the reply. How can I choose a representative transcript for a gene? Can I select one at random, or should I select the best transcript for a gene based on transcript length or the number of exons predicted for that transcript?

0
4.8 years ago
gaoteng ▴ 70

I recently ran into the same problem. You can use the Transcript Support Level (TSL) given by Ensembl, together with the transcript length, to decide (somewhat subjectively) which transcript is "canonical". I wrote a script that does this: https://github.com/teng-gao/genomics_utils
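As a rough illustration of the TSL-plus-length rule (not the linked script, whose logic may differ), here is a minimal sketch. It assumes you have already exported, for each transcript, its gene ID, TSL string (for example `tsl1` to `tsl5`, or `tslNA`), and length. The function names and the TSL and length values below are placeholders invented for the example, not real Ensembl annotation.

```python
def tsl_rank(tsl):
    """Map 'tsl1'..'tsl5' to 1..5; anything else (e.g. 'tslNA', empty) ranks last."""
    tsl = tsl or ""
    digit = tsl[3:4] if tsl.startswith("tsl") else ""
    return int(digit) if digit.isdigit() else 6

def pick_representative(records):
    """Pick one transcript per gene: best (lowest) TSL first, then longest."""
    chosen = {}
    for gene, transcript, tsl, length in records:
        key = (tsl_rank(tsl), -length)  # smaller key is better
        if gene not in chosen or key < chosen[gene][0]:
            chosen[gene] = (key, transcript)
    return {gene: transcript for gene, (_, transcript) in chosen.items()}

# IDs come from the question; TSL and length values are invented placeholders.
records = [
    ("ENSG00000139618", "ENST00000530893", "tsl1", 2000),
    ("ENSG00000139618", "ENST00000380152", "tsl1", 3000),
    ("ENSG00000139618", "ENST00000528762", "tsl3", 5000),
    ("ENSG00000139618", "ENST00000470094", "tslNA", 800),
]
representative = pick_representative(records)
print(representative)
```

Ranking TSL before length is a design choice: a well-supported shorter transcript is preferred over a longer but poorly supported one. Swap the order of the tuple in `key` if length matters more for your analysis.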

1

To add to your answer, the APPRIS annotation is also a very good indicator for selecting a transcript when a gene has several isoforms. :)

0
4.8 years ago

There is a (relatively new) Ensembl training video on YouTube on the topic of 'Choosing a transcript':