How reliable are the Ensembl transcript isoforms?
2.3 years ago
simplitia ▴ 80

Hi, I was reading a recent article that mentions two isoforms for a particular gene the authors were interested in. However, when I went to Ensembl, it turned out that there were many more, including non-protein-coding versions. I'm assuming the Ensembl transcripts are all predicted, correct? How reliable are they, and can I somehow find out whether particular versions have been experimentally confirmed? Here is a random example for the NTRK3 gene, which shows at least 10 transcripts related to this gene. I'm wondering if CCDS can help here, or UniProt, which has experimental evidence.

http://www.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;g=ENSG00000140538;r=15:87876789-88256153;t=ENST00000626019

thanks

2.2 years ago

Hello

The human genome has been annotated automatically as well as manually: https://www.ensembl.org/info/genome/genebuild/index.html

There are several different 'biotypes' apart from protein coding genes: https://www.ensembl.org/info/genome/genebuild/biotypes.html

You can explore the supporting evidence that was used for annotation of a specific transcript in the browser, e.g.: http://www.ensembl.org/Homo_sapiens/Transcript/SupportingEvidence?db=core;g=ENSG00000140538;r=15:87876789-88256153;t=ENST00000394480

The MANE transcripts are flagged in the transcript table. We suggest working with the MANE transcript if you have to choose one. Please note that they are not available for all genes yet. Here is an FAQ that should help you decide which transcript to use: http://www.ensembl.org/Help/Faq?id=276
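For anyone who prefers a script to the browser, here is a minimal sketch of listing a gene's transcripts with their biotypes through the Ensembl REST API. It assumes the public `lookup/id` endpoint with `expand=1` and the response fields `Transcript`, `id`, `biotype` and `is_canonical`; the function names are made up for illustration. Please check the field names against the current REST documentation before relying on them.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def fetch_gene(gene_id):
    """Download the gene record, including its transcripts (expand=1)."""
    url = f"{REST_SERVER}/lookup/id/{gene_id}?expand=1"
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def summarize_transcripts(gene_record):
    """Return (transcript id, biotype, is_canonical) tuples, canonical first.

    The keys used here are assumptions about the JSON layout; missing keys
    fall back to None / False instead of raising.
    """
    rows = []
    for transcript in gene_record.get("Transcript", []):
        rows.append(
            (
                transcript.get("id"),
                transcript.get("biotype"),
                bool(transcript.get("is_canonical")),
            )
        )
    # False sorts before True, so canonical transcripts come first.
    return sorted(rows, key=lambda row: (not row[2], row[0] or ""))


# Example usage (needs network access):
# gene = fetch_gene("ENSG00000140538")
# for transcript_id, biotype, canonical in summarize_transcripts(gene):
#     print(transcript_id, biotype, "canonical" if canonical else "")
```

The canonical flag is only a rough stand-in for the MANE flag shown in the transcript table, so treat the output as a starting point and confirm the MANE status in the browser.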

Best wishes

Astrid

Ensembl helpdesk

2.3 years ago

Not all Ensembl transcripts are predictions. The human genome annotation process is described here. For higher confidence in the transcripts, you may want to consider those that are part of the CCDS set. For support for a particular transcript, you can look at the associated flags.
Note also that not all transcripts of a gene are expressed at the same time and place, so a gene may be associated with many transcripts, but in a given experimental system only one or a few may be relevant.

2.3 years ago
vkkodali_ncbi ★ 3.3k

I suggest you also check out the MANE project. This set includes only one selected transcript per gene, and that transcript matches an Ensembl transcript end to end, including UTRs.

As far as evidence goes, you can find that in the flat files for RefSeq records. For example, you can see the following for NM_002826:

    ##Evidence-Data-START##
    Transcript exon combination :: U97276.2, SRR3476690.853351.1
                                   [ECO:0000332]
    RNAseq introns              :: single sample supports all introns
                                   SAMEA1965299, SAMEA1966682
                                   [ECO:0000348]
    ##Evidence-Data-END##
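If you want to collect this evidence for many records, the block is easy to pull apart with a few lines of Python. This is a minimal sketch that assumes you already have the COMMENT text of the flat file as a string; the function name `parse_evidence_block` and the `SAMPLE` constant are made up for illustration, and the only layout it relies on is the `::` separator and the START/END markers shown above.

```python
import re

# The evidence block quoted above for NM_002826, as plain text.
SAMPLE = """##Evidence-Data-START##
Transcript exon combination :: U97276.2, SRR3476690.853351.1
[ECO:0000332]
RNAseq introns              :: single sample supports all introns
SAMEA1965299, SAMEA1966682
[ECO:0000348]
##Evidence-Data-END##"""


def parse_evidence_block(comment_text):
    """Return {field name: [value lines]} for the first Evidence-Data block."""
    match = re.search(
        r"##Evidence-Data-START##(.*?)##Evidence-Data-END##",
        comment_text,
        flags=re.DOTALL,
    )
    if match is None:
        return {}
    evidence = {}
    current = None
    for raw_line in match.group(1).splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if "::" in line:
            # A "name :: value" line starts a new field.
            key, value = line.split("::", 1)
            current = key.strip()
            evidence[current] = [value.strip()]
        elif current is not None:
            # Any other line continues the previous field.
            evidence[current].append(line)
    return evidence


if __name__ == "__main__":
    for field, values in parse_evidence_block(SAMPLE).items():
        print(field, "->", values)
```

The ECO codes end up as the last entry of each field, so you can filter transcripts by evidence type from there.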