Question: misc_RNA in Ensembl
5.3 years ago by
Seville, ES
Martombo2.5k wrote:

Do you know which criteria Ensembl uses to classify a gene as "misc_RNA"? I couldn't find an answer on the Ensembl page describing non-coding RNA.

A few examples of such genes retrieved from BioMart (location; transcript ID):

13:23726725-23726825; ENST00000384428
13:95351479-95351756; ENST00000470538
13:95963084-95963209; ENST00000411366

These examples appear to be pseudogenes. Why aren't they assigned the "pseudogene" gene type? What makes them misc_RNA?

Thank you!

written 5.3 years ago by Martombo2.5k
5.3 years ago by
Emily_Ensembl19k wrote:

misc_RNA is defined as any ncRNA that we can't categorise as anything else.

If you look at your genes of interest here, the word 'pseudogene' is found in the gene name ('RNA, Ro-associated Y3 pseudogene 4 [Source:HGNC Symbol;Acc:42488]'), which we lift directly from HGNC. However, these genes do not fit our definition of pseudogenes so are not classified as such. We can't change the official HGNC name, but we will only annotate genes as what we believe them to be.

written 5.3 years ago by Emily_Ensembl19k

There is some cryptic circularity going on here.

For example, I can't trace the provenance of RNY3P4. It appears to be a RefSeq prediction of something and points back to HGNC, but I wasn't aware they did predictions of pseudogenes (they might annotate them), so where did this come from?

But ENSG00000207157.1 says "No overlapping RefSeq" and clips the RefSeq down from 301 to a 101 exon? On the basis of an Rfam model?

And there is nothing from Havana/Vega at this location?

written 5.3 years ago by cdsouthan1.8k

We got it from an Rfam record. And no, there is no manual annotation on these guys.

written 5.3 years ago by Emily_Ensembl19k

We're into serious "what is a gene" territory here... I might pose it as a general question. It's getting crucial as more equivocal automated and manual annotations keep stacking up.

written 5.3 years ago by cdsouthan1.8k

Thanks for your reply! What is the difference between "type" and "locus_tag"? I am looking for all rRNA sequences in a GenBank file, and I am a little confused about why many 5S_rRNA genes are labeled as "misc_RNA". For example:

     gene            493631..493747
                     /gene=ENSDARG00000085618
                     /locus_tag="5S_rRNA"
                     /note="5S ribosomal RNA [Source:RFAM;Acc:RF00001]"
     misc_RNA        493631..493747
                     /gene="ENSDARG00000085618"
                     /db_xref="RFAM_trans_name:5S_rRNA.1275-201"
                     /note="rRNA"
                     /note="transcript_id=ENSDART00000121018"

Thanks in advance.

written 2.3 years ago by AlicePsyche30
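
On the practical side of that last question: in a GenBank feature table the feature key (gene, misc_RNA, rRNA, ...) is the feature's type, while /locus_tag is just one of its qualifiers. If the rRNA transcripts in an Ensembl-derived GenBank file carry the misc_RNA key plus /note="rRNA", as in the record quoted above, one possible way to collect them is to filter on the note rather than on the feature key alone. The following is only a minimal sketch under that assumption. The names parse_features and is_rrna are hypothetical helpers, the parser handles only single-line qualifiers, and whether every rRNA in a given file follows the same /note convention would need checking.

```python
import re
from typing import Dict, List

# The record quoted in the comment above, laid out as a GenBank feature table.
SAMPLE = '''\
     gene            493631..493747
                     /gene=ENSDARG00000085618
                     /locus_tag="5S_rRNA"
                     /note="5S ribosomal RNA [Source:RFAM;Acc:RF00001]"
     misc_RNA        493631..493747
                     /gene="ENSDARG00000085618"
                     /db_xref="RFAM_trans_name:5S_rRNA.1275-201"
                     /note="rRNA"
                     /note="transcript_id=ENSDART00000121018"
'''

# A qualifier line looks like /name=value or just /name.
QUALIFIER = re.compile(r'^/(\w+)(?:=(.*))?$')


def parse_features(text: str) -> List[Dict]:
    """Parse feature-table lines into dicts with type, location and qualifiers.

    Simplification: multi-line qualifier values are ignored, not joined.
    """
    features: List[Dict] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        indent = len(raw) - len(raw.lstrip())
        match = QUALIFIER.match(line)
        if match and features:
            # Attach the qualifier to the most recent feature; a name may repeat.
            name = match.group(1)
            value = (match.group(2) or "").strip('"')
            features[-1]["qualifiers"].setdefault(name, []).append(value)
        elif not line.startswith("/") and indent < 21:
            # Feature key lines are indented less than qualifier lines.
            parts = line.split()
            if len(parts) == 2:
                features.append(
                    {"type": parts[0], "location": parts[1], "qualifiers": {}}
                )
    return features


def is_rrna(feature: Dict) -> bool:
    """True for rRNA features, or misc_RNA features carrying /note="rRNA"."""
    if feature["type"] == "rRNA":
        return True
    notes = feature["qualifiers"].get("note", [])
    return feature["type"] == "misc_RNA" and "rRNA" in notes


rrna_features = [f for f in parse_features(SAMPLE) if is_rrna(f)]
for f in rrna_features:
    print(f["type"], f["location"], f["qualifiers"].get("gene"))
```

For real files, a dedicated GenBank parser would be a safer choice than this hand-rolled one; the filter in is_rrna is the part that matters here.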
