Question: Ensembl Gtf/Gff File Misses Obvious Rrnas
9.6 years ago by
Nick Crawford210
Philadelphia PA
Nick Crawford210 wrote:

I'm using an Ensembl GTF file (release 61) to remove rRNA from an RNA-seq dataset. Ensembl GTFs contain an rRNA annotation that makes this trivially easy to do.

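import os

gtf_path = 'mygenome.0.61.gtf'
# The output keeps everything except the rRNA records, so name it accordingly.
out_path = os.path.splitext(gtf_path)[0] + '.no_rRNA.gtf'

with open(gtf_path) as fin, open(out_path, 'w') as fout:
    for line in fin:
        fields = line.rstrip('\n').split('\t')
        # Column 2 carries the 'rRNA' label in this file.
        # Lines without a second column (e.g. comments) are passed through.
        if len(fields) < 2 or fields[1] != 'rRNA':
            fout.write(line)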
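
Before filtering, it can help to check which labels the second column actually carries. A minimal sketch, assuming a tab-separated GTF whose second column holds the label as in the file above; `count_column2` is a hypothetical helper name and the sample lines are made up for illustration:

```python
from collections import Counter

def count_column2(lines):
    """Tally the values found in column 2 of GTF lines (an iterable of str)."""
    counts = Counter()
    for line in lines:
        if line.startswith('#'):
            continue  # skip comment/header lines
        fields = line.rstrip('\n').split('\t')
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts

# Made-up records for illustration only, not real annotation.
example = [
    "chr1\trRNA\texon\t100\t200\t.\t+\t.\tgene_id \"G1\";\n",
    "chr1\tprotein_coding\texon\t300\t400\t.\t+\t.\tgene_id \"G2\";\n",
    "chr1\tprotein_coding\tCDS\t300\t400\t.\t+\t0\tgene_id \"G2\";\n",
]
print(count_column2(example))
```

With a real file, the same function can be fed an open file handle, e.g. `with open('mygenome.0.61.gtf') as fh: print(count_column2(fh))`.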

However, after trimming my dataset of 1,980 rRNA transcripts I still find obvious rRNAs in it.

e.g.:
ENSACAG00000014849    ribosomal protein L38 (rpl38)
ENSACAG00000005015    ribosomal protein S21 (RPS21)
ENSACAG00000011604    ribosomal protein S27 (rps27)
ENSACAG00000010479    ribosomal protein S12 (Rps12)
ENSACAG00000007960    ribosomal protein S24 (Rps24)
etc.

Has anyone else had this issue? Can you suggest any workarounds? Are there better Ensembl gene lists out there that I could use to filter? GO terms, perhaps?

gene rrna rna ensembl • 3.5k views
modified 9.6 years ago by Neilfws • written 9.6 years ago by Nick Crawford
9.6 years ago by
Neilfws49k
Sydney, Australia
Neilfws49k wrote:

I think you answered your own question, but just to clarify: rRNAs (principally 28S and 18S in eukaryotes, 23S and 16S in prokaryotes) are untranslated RNAs which play a structural role in the large and small ribosomal subunits. What you have there are mRNAs encoding ribosomal proteins.
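
If the goal is also to drop the mRNAs that encode ribosomal proteins, one possible workaround is to match on the gene description rather than on the rRNA label. A rough sketch, assuming a gene ID to description mapping is already available (for example exported from Ensembl BioMart); `is_ribosomal_protein` and the regular expression are illustrative assumptions, not an Ensembl convention. Note that the pattern also matches mitochondrial ribosomal proteins and related names such as "ribosomal protein S6 kinase", so the hits should be reviewed by hand:

```python
import re

# Assumed pattern: descriptions of the form "ribosomal protein L38 (rpl38)".
RIBO_PROTEIN = re.compile(r'\bribosomal protein\b', re.IGNORECASE)

def is_ribosomal_protein(description):
    """Return True if a gene description mentions 'ribosomal protein'."""
    return bool(RIBO_PROTEIN.search(description))

# Two descriptions copied from the question, plus one made-up entry
# that should not match.
descriptions = {
    'ENSACAG00000014849': 'ribosomal protein L38 (rpl38)',
    'ENSACAG00000005015': 'ribosomal protein S21 (RPS21)',
    'HYPOTHETICAL_GENE_1': 'actin, beta',
}

ribo_ids = sorted(g for g, d in descriptions.items() if is_ribosomal_protein(d))
print(ribo_ids)
```

The resulting IDs could then be used to exclude those genes from the dataset alongside the rRNA records.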

9.6 years ago by
Nick Crawford210
Philadelphia PA
Nick Crawford210 wrote:

Hmm... I think I've figured out why the ribosomal proteins are showing up: they're not rRNAs (rRNAs are untranslated).
