Question

Annotation Human genome transcripts, Transcript Support Level meaning

0

4.8 years ago

MarVi ▴ 30

Dear all,

I have a few questions about the annotation of non-coding transcripts. Hopefully someone here knows some or all of the answers.

Does anyone know what "Transcript support level: NA" means for some of the transcripts in the GENCODE/Ensembl annotation GTF? On their website (transcript quality tags), they state that those transcripts were not analyzed, for several reasons. So those non-coding transcripts or pseudogenes might be suspect and not valid. But how were those transcripts annotated in the first place? What does it mean that they weren't analyzed? How are these non-coding transcripts included in the annotation files? I found that some of them are predicted by algorithms; are those algorithms based on alignments of transcript assemblies, or on something else? And when it says a transcript is 'manually' annotated, what does 'manually' mean?

Thank you in advance for your answers.

gene Annotation Transcripts ensembl • 1.3k views

updated 4.8 years ago by Emily 23k • written 4.8 years ago by MarVi ▴ 30

score 1 · Answer 1 · 2019-07-19

As it says on the piece of documentation you linked to, TSL:NA means that the transcript was not analysed for one of the following reasons:
* pseudogene annotation, including transcribed pseudogenes
* human leukocyte antigen (HLA) transcript
* immunoglobulin gene transcript
* T-cell receptor transcript
* single-exon transcript (will be included in a future version)

TSL is assessed after the transcripts are annotated, not as part of the annotation. These kinds of transcripts, which differ from most transcripts and therefore cannot be analysed in the same way, are excluded.
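If you want to check which kinds of transcripts carry TSL:NA in your own GTF, a rough sketch like the one below tallies transcripts by biotype and support level. It assumes the TSL is stored in a `transcript_support_level` attribute and the biotype in `transcript_type` (GENCODE) or `transcript_biotype` (Ensembl); check these names against your file. The function names and the toy lines are made up for illustration and are not real annotation records.

```python
import re
from collections import Counter

# Matches `key "value"` pairs in the GTF attribute column (column 9).
# Unquoted attributes such as `level 2;` are skipped by this pattern.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(attr_field):
    """Return the quoted GTF attributes as a dict.

    Keys that occur more than once (e.g. `tag`) keep only the last value.
    """
    return dict(ATTR_RE.findall(attr_field))


def tally_tsl(lines):
    """Count transcript records per (biotype, TSL) pair."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript-level records are counted
        attrs = parse_attributes(fields[8])
        # Assumed attribute names; the value may carry a trailing note,
        # so only the first whitespace-separated token is kept.
        tsl_raw = attrs.get("transcript_support_level", "")
        tsl = tsl_raw.split()[0] if tsl_raw.strip() else "missing"
        biotype = (attrs.get("transcript_type")
                   or attrs.get("transcript_biotype", "unknown"))
        counts[(biotype, tsl)] += 1
    return counts


# Toy, made-up lines in GTF layout (not real annotation records).
toy_lines = [
    "##description: toy example\n",
    'chr1\tHAVANA\ttranscript\t1\t100\t.\t+\t.\t'
    'gene_id "G1"; transcript_id "T1"; transcript_type "lncRNA"; '
    'level 2; transcript_support_level "1";\n',
    'chr1\tHAVANA\ttranscript\t200\t300\t.\t-\t.\t'
    'gene_id "G2"; transcript_id "T2"; '
    'transcript_type "processed_pseudogene"; '
    'transcript_support_level "NA";\n',
    'chr1\tHAVANA\texon\t200\t300\t.\t-\t.\t'
    'gene_id "G2"; transcript_id "T2";\n',
]

for (biotype, tsl), n in sorted(tally_tsl(toy_lines).items()):
    print(biotype, tsl, n)
```

For a real file, pass an open file handle instead of `toy_lines`, e.g. `tally_tsl(open("annotation.gtf"))` (the path is a placeholder). If the documentation quoted above applies to your release, the NA counts should be concentrated in pseudogene, HLA, immunoglobulin, T-cell receptor and single-exon transcripts.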

The processes of automatic and manual annotation are all described in the documentation. This paper describes the automatic annotation in detail and this paper has more detail on the manual annotation.