I am merging ncRNA annotations for rat (rn6) and I found that some lncRNAs (longer than 200 nt) in NCBI have a CDS. Some examples are LOC100910905 and LOC102550759. Why do they have a CDS if they are defined as "ncRNA"?
Hi,
Theoretically, ncRNA is RNA that is transcribed but not translated, so it is normal for it to have exonic and intronic regions. Depending on its genomic location, it may overlap other annotated regions, such as UTRs, introns, and CDS (PMID:25233092).
I checked the first three entries of your data and did not see them overlapping a known CDS (vs. Ensembl and rat mRNA). So maybe you need to check how the Gnomon predictions compare with known ncRNAs, e.g. lincRNA and miRNA.
This link might help you too: lncRNA
hth
Hi,
Thanks for the fast reply.
The ncRNAs may or may not overlap known protein CDS, but that is not the question. My problem is that in the NCBI GTF file, some ncRNAs are annotated with both exon and CDS features. If you look up these genes in NCBI Gene, they are classified as "ncRNA" under gene type, but when you click through to GenBank, you see genomic locations for an mRNA and a CDS. It looks as if part of the sequence of the ncRNA codes for protein. I must be misunderstanding something. =(
Have you checked the strand information?
They are on the same strand. Here is an extract from the GTF:
16 Gnomon exon 21402863 21403031 . + . gene_id "LOC102550759"; gene_name "LOC102550759"; p_id "P13265"; transcript_id "XM_008771189.1"; tss_id "TSS33612";
16 Gnomon CDS 21402957 21403031 . + 0 gene_id "LOC102550759"; gene_name "LOC102550759"; p_id "P13265"; transcript_id "XM_008771189.1"; tss_id "TSS33612";
16 Gnomon CDS 21421987 21422217 . + 0 gene_id "LOC102550759"; gene_name "LOC102550759"; p_id "P13265"; transcript_id "XM_008771189.1"; tss_id "TSS33612";
16 Gnomon exon 21421987 21422375 . + . gene_id "LOC102550759"; gene_name "LOC102550759"; p_id "P13265"; transcript_id "XM_008771189.1"; tss_id "TSS33612";
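
To find every gene affected by this, one option is to scan the GTF and list the transcripts that carry at least one CDS feature. The Python sketch below is only an illustration: the function names (`parse_gtf_line`, `transcripts_with_cds`) are made up for this example, it assumes a tab-separated, 9-column GTF with `key "value";` attributes, and it uses the four lines from the extract above as test input. It may also help to look at the transcript accession prefix in the output: in RefSeq, `XM_` is used for predicted mRNA models and `XR_` for predicted non-coding RNA models, so an `XM_` transcript under a gene labelled "ncRNA" suggests the gene-level and transcript-level annotations disagree.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_gtf_line(line):
    """Return (feature, attributes dict) for one GTF line, or None if it is not a data line."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:  # assumption: standard 9-column, tab-separated GTF
        return None
    return fields[2], dict(ATTR_RE.findall(fields[8]))


def transcripts_with_cds(lines):
    """Map gene_id -> set of transcript_ids that have at least one CDS feature."""
    hits = defaultdict(set)
    for line in lines:
        parsed = parse_gtf_line(line)
        if parsed is None:
            continue
        feature, attrs = parsed
        if feature == "CDS":
            hits[attrs.get("gene_id", "")].add(attrs.get("transcript_id", ""))
    return dict(hits)


# The four lines from the extract, rebuilt with explicit tab separators.
attrs = ('gene_id "LOC102550759"; gene_name "LOC102550759"; p_id "P13265"; '
         'transcript_id "XM_008771189.1"; tss_id "TSS33612";')
rows = [
    ("16", "Gnomon", "exon", "21402863", "21403031", ".", "+", ".", attrs),
    ("16", "Gnomon", "CDS", "21402957", "21403031", ".", "+", "0", attrs),
    ("16", "Gnomon", "CDS", "21421987", "21422217", ".", "+", "0", attrs),
    ("16", "Gnomon", "exon", "21421987", "21422375", ".", "+", ".", attrs),
]
sample = ["\t".join(row) for row in rows]

result = transcripts_with_cds(sample)
print(result)  # {'LOC102550759': {'XM_008771189.1'}}
```

On the full file, the same function can be fed an open file handle, e.g. `transcripts_with_cds(open(path))`, where `path` is wherever your GTF lives; the resulting gene IDs can then be compared against your list of genes typed as ncRNA.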