Noncoding RNAs Having CDS in Ensembl?
11.5 years ago
user ▴ 940

In the standard ensGene table obtained from UCSC for the mm9 genome, the noncoding RNA gene MALAT1 has the following annotation:

629    ENSMUST00000172812    chr19    -    5795689    5802671    5802671    5802671    1    5795689,    5802671,    0    ENSMUSG00000092341    none    none    -1,
629    ENSMUST00000173314    chr19    -    5795690    5797464    5797464    5797464    2    5795690,5797003,    5795785,5797464,    0    ENSMUSG00000092341    none    none    -1,-1,
629    ENSMUST00000174808    chr19    -    5795693    5796952    5796952    5796952    2    5795693,5796388,    5795785,5796952,    0    ENSMUSG00000092341    none    none    -1,-1,
629    ENSMUST00000173499    chr19    -    5801915    5802671    5802671    5802671    2    5801915,5802237,    5801966,5802671,    0    ENSMUSG00000092341    none    none    -1,-1,
629    ENSMUST00000173523    chr19    -    5801942    5802640    5802640    5802640    2    5801942,5802065,    5802023,5802640,    0    ENSMUSG00000092341    none    none    -1,-1,

MALAT1 is supposed to be a nuclear noncoding RNA. Why does it have annotated CDS start and end fields?

rna ensembl annotation genome • 2.5k views

There are surprises in life. Some ncRNAs in bacteria contain an ORF, such as RNAIII of S. aureus or SgrS of E. coli.

11.5 years ago
Laura ★ 1.8k

The Ensembl annotation doesn't have a CDS:

http://www.ensembl.org/Mus_musculus/Gene/Summary?g=ENSMUSG00000092341;r=19:5795690-5802671

11.5 years ago
biorepine ★ 1.5k

It has been reported that many lncRNAs produce short peptides even though the overall transcript is noncoding. This might be the result of a short ORF hidden inside MALAT1. I cannot clearly distinguish your CDS start and end. I suggest you calculate the difference between end and start; if it is more than 200 bp, then something could be wrong.


The CDS here is exactly 1 nucleotide long, which makes no sense. The CDS start and end are 5802671 and 5802671.


This is a very late post, just to clarify a few things, for anyone stumbling across this post from Google (like me)...

ensGene files are in UCSC refGene (genePred) format, which uses 0-based, half-open coordinates, so the CDS has length 0 (end minus start), not 1. The format convention always specifies a CDS start and end, even if no CDS exists. If there is no CDS, the reported start and end will both be the final position of the transcript, in this case 5802671. Also, columns 14-15 indicate the status of the CDS start and end, and here they both have the value "none", indicating no CDS.
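To make the arithmetic concrete, here is a minimal Python sketch that checks the five rows from the question. The column names are an assumption based on the usual genePred layout with a leading bin column (they are not given in the original post), and `parse_row` and `cds_length` are just illustrative helper names.

```python
# Assumed column order: genePred-style table with a leading "bin" column.
FIELDS = [
    "bin", "name", "chrom", "strand", "txStart", "txEnd",
    "cdsStart", "cdsEnd", "exonCount", "exonStarts", "exonEnds",
    "score", "name2", "cdsStartStat", "cdsEndStat", "exonFrames",
]

# The five MALAT1 rows quoted in the question (whitespace-separated).
RAW = """\
629 ENSMUST00000172812 chr19 - 5795689 5802671 5802671 5802671 1 5795689, 5802671, 0 ENSMUSG00000092341 none none -1,
629 ENSMUST00000173314 chr19 - 5795690 5797464 5797464 5797464 2 5795690,5797003, 5795785,5797464, 0 ENSMUSG00000092341 none none -1,-1,
629 ENSMUST00000174808 chr19 - 5795693 5796952 5796952 5796952 2 5795693,5796388, 5795785,5796952, 0 ENSMUSG00000092341 none none -1,-1,
629 ENSMUST00000173499 chr19 - 5801915 5802671 5802671 5802671 2 5801915,5802237, 5801966,5802671, 0 ENSMUSG00000092341 none none -1,-1,
629 ENSMUST00000173523 chr19 - 5801942 5802640 5802640 5802640 2 5801942,5802065, 5802023,5802640, 0 ENSMUSG00000092341 none none -1,-1,
"""


def parse_row(line):
    """Split one table row into a dict, converting coordinates to int."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    row = dict(zip(FIELDS, values))
    for key in ("txStart", "txEnd", "cdsStart", "cdsEnd", "exonCount"):
        row[key] = int(row[key])
    return row


def cds_length(row):
    """Length of the CDS span under 0-based, half-open coordinates."""
    return row["cdsEnd"] - row["cdsStart"]


rows = [parse_row(line) for line in RAW.splitlines() if line.strip()]

for row in rows:
    print(
        row["name"],
        "CDS span:", cds_length(row),
        "status:", row["cdsStartStat"], row["cdsEndStat"],
        "cdsStart == txEnd:", row["cdsStart"] == row["txEnd"],
    )
```

For these rows, the span is zero and both status fields are "none" in every case, which matches the explanation above: the CDS columns are placeholders rather than an annotated coding region.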
