What does a gene's "aligned length" represent?
8 months ago
Ethan Lee • 0

What does a gene's "aligned length" represent in the NCBI Gene database? I understand that CDS length is the length of the coding sequence (the number of amino acid residues plus 1 for the stop codon, multiplied by 3), but I can't work out what aligned length is.


Can you show us an example of where you see this?


Sure, I'll add a link: SNCA synuclein alpha Homo sapiens (human). If you hover the mouse pointer over the transcript map of this gene (without clicking), you can see the "aligned length" information mentioned in my question. It would help me a lot if you could explain what this means. Thanks!


It's the total length of the exons, as opposed to the genomic length (exons plus introns), which the hover menu calls the span.
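
As a rough illustration of that distinction, here is a minimal sketch. It assumes a transcript described by 1-based, inclusive exon coordinates on the genome; the coordinates and the function names are invented for the example and are not SNCA's actual values or anything NCBI provides.

```python
# Hypothetical exon coordinates (1-based, inclusive) on the genome.
# These numbers are made up for illustration only.
exons = [(100, 250), (1000, 1120), (5000, 5400)]

def aligned_length(exons):
    """Sum of the exon lengths: the transcript bases placed on the genome."""
    return sum(end - start + 1 for start, end in exons)

def span(exons):
    """Genomic distance from the first exon start to the last exon end (exons + introns)."""
    first_start = min(start for start, _ in exons)
    last_end = max(end for _, end in exons)
    return last_end - first_start + 1

print("aligned length:", aligned_length(exons))
print("span:", span(exons))
```

With these toy coordinates the span is much larger than the aligned length, because the span also counts the intronic bases between exons.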


It is the length of the transcript

[Screenshots: the tooltip; the entry in GenBank/Nuccore; the corresponding entry in Ensembl]


I see! Thanks a lot!

8 months ago
vkkodali_ncbi ★ 3.7k

"It is the length of the transcript"

In this case, yes. More precisely, though, it is the length of the part of the transcript that aligns to the genome. For example, take a look at the human gene CDKN1C and the transcript NM_000076.2.

As you can see, the aligned length and sequence length are different here. If you look at the sequence of NM_000076.2 you will notice that this transcript has a polyA tail that does not _align_ to the genome. Hence the difference.
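
To make the polyA case concrete, here is a toy sketch. The sequence and the alignment blocks are invented (they are not taken from NM_000076.2), and the block representation (0-based, half-open intervals on the transcript) is an assumption chosen for the example.

```python
# Toy transcript: 20 bases of transcribed sequence followed by a 10-base polyA tail.
# Invented for illustration; not the real NM_000076.2 sequence.
transcript = "ATGGCCTTAGGCTTACGGAT" + "A" * 10

# Hypothetical alignment blocks: the intervals of the transcript (0-based, half-open)
# that were placed on the genome. The polyA tail is covered by no block.
aligned_blocks = [(0, 12), (12, 20)]

sequence_length = len(transcript)
aligned_length = sum(end - start for start, end in aligned_blocks)
unaligned_bases = sequence_length - aligned_length

print("sequence length:", sequence_length)
print("aligned length:", aligned_length)
print("unaligned bases:", unaligned_bases)
```

Here the sequence length counts every base in the record, while the aligned length counts only the bases covered by an alignment block, so the tail accounts for the difference.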


Thanks for the clarification. I guess that would depend somewhat on the submissions people have made, since there is no poly-A in the genome after the end of the alignable sequence.


Manually curated RefSeq transcripts are based on cDNA sequences that were submitted to GenBank. If one such sequence has a polyA tail, a curator can choose to retain it in the final RefSeq. Not all RefSeqs will have a polyA tail, though.

The aligned sequence length and the transcript length can also differ when the RefSeq transcript sequence and the genome sequence are not completely identical.
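
One way to picture this second case is with a SAM-style CIGAR string describing a transcript-to-genome alignment. The sketch below is only an illustration: the CIGAR string is invented, the function names are hypothetical, and counting only match/mismatch columns as "aligned" is an assumption about the convention, not a statement of how NCBI computes the value.

```python
import re

# SAM-style CIGAR operations: M/=/X are aligned columns, I is an insertion in the
# transcript, D a deletion, N a skipped region (intron), S/H clipping, P padding.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_transcript_bases(cigar):
    """Count transcript bases in match/mismatch columns (M, =, X)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=X")

def total_transcript_bases(cigar):
    """Count all transcript bases in the alignment record (M, =, X, I, S)."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=XIS")

# Invented example: two exons separated by a 1000-base intron, with a
# 2-base insertion in the transcript relative to the genome.
cigar = "50M2I48M1000N100M"

print("transcript length:", total_transcript_bases(cigar))
print("aligned bases:", aligned_transcript_bases(cigar))
```

Under this assumed convention, the two inserted bases belong to the transcript but have no counterpart in the genome, so the aligned count comes out shorter than the transcript length.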


Thanks a lot! That's very clear!
