Question: rRNA Removal in RNA-Seq Data
7.1 years ago by
Belgium, Brussels
Nicolas Rosewick8.7k wrote:


I want to know how many reads in my data come from rRNA. My libraries were prepared with Illumina TruSeq (Stranded) and RiboZero. Here is my idea.

In the UCSC Table Browser:

1. Select "All Tables" from the group drop-down list.
2. Select the "rmsk" table from the table drop-down list.
3. Choose "GTF" as the output format.
4. Type a filename in "output file" so your browser downloads the result.
5. Click "create" next to filter.
6. Next to "repClass", type rRNA.
7. Next to free-form query, select "OR" and type repClass = "tRNA".
8. Click submit on that page, then "get output" on the main page.

Now I have a GTF file with the rRNA and tRNA annotations.

After that, I use htseq-count to extract the number of reads per rRNA and tRNA gene.
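For what it's worth, here is a rough Python sketch of how the htseq-count output could then be turned into a single fraction. It assumes the default tab-separated output (feature ID first, count last, with the summary counters prefixed by `__`) and that the GTF holds only rRNA/tRNA features. The function names and the choice of denominator are my own assumptions, not something htseq-count provides.

```python
def summarize_htseq_counts(lines):
    """Split htseq-count output into the summed feature counts and the
    special '__' counters (e.g. __no_feature, __alignment_not_unique)."""
    feature_total = 0
    special = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split("\t")
        name, count = fields[0], int(fields[-1])
        if name.startswith("__"):
            special[name] = count
        else:
            feature_total += count
    return feature_total, special


def rrna_fraction(feature_total, special):
    """Fraction of everything htseq-count reported that was assigned to
    a feature in the rRNA/tRNA GTF (assumed denominator: all categories)."""
    total = feature_total + sum(special.values())
    return feature_total / total if total else 0.0
```

Reads that land in `__alignment_not_unique` are not assigned to any feature, so for multi-copy rRNA genes this fraction is probably better read as a lower bound.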

Is that OK?




Looks OK to me without checking the details.

written 7.1 years ago by Sean Davis
7.1 years ago by
Ashutosh Pandey12k wrote:

You can also download RepBase (a database of repetitive elements) and Rfam (a database of different RNA species) and use them as filter databases.

7.1 years ago by
Bethesda, MD
Ryan Dale4.9k wrote:

Your post-alignment filtering strategy should work. Another strategy is to do a separate alignment of unaligned reads to rRNA sequence. Since rRNA genes tend to be duplicated in eukaryotes, it's possible that highly multi-mapping reads are discarded (depending on the aligner and parameters you use) such that those reads don't make it into the final alignments you would use with htseq-count.
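One way to pull the unaligned reads back out for that second alignment is sketched below in Python. It is only a sketch: it assumes a single-end, uncompressed SAM file in which the aligner kept unmapped reads (flag 0x4) along with their sequence and qualities, and the function name `unaligned_to_fastq` is made up for illustration.

```python
def unaligned_to_fastq(sam_lines):
    """Yield one FASTQ record (as a string) per unmapped read found in
    an iterable of SAM text lines."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        # 0x4 = read unmapped; skip records without stored sequence/qualities
        if flag & 0x4 and seq != "*" and qual != "*":
            yield f"@{name}\n{seq}\n+\n{qual}\n"
```

Writing the yielded records to a file gives a FASTQ that can then be aligned against rRNA sequences with whatever aligner you already use. Note that reads discarded for multi-mapping only show up here if your aligner reports them as unmapped.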

Also, if your goal is to remove rRNA reads from downstream analysis, and you use the post-alignment filtering strategy, you may want to go back and remove other alignments of that read that multi-mapped to non-rRNA regions.
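A minimal two-pass sketch of that clean-up, in plain Python over uncompressed SAM text: first collect the names of reads with any alignment overlapping an rRNA interval from the GTF, then drop every record carrying one of those names, wherever it aligned. All function names are hypothetical, and the linear scan over intervals is for illustration only; on real data an indexed lookup or a dedicated tool would be the better choice.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def load_intervals(gtf_lines):
    """Collect 1-based inclusive (start, end) intervals per chromosome."""
    intervals = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        intervals[fields[0]].append((int(fields[3]), int(fields[4])))
    return intervals


def aligned_span(pos, cigar):
    """Return the 1-based inclusive reference span covered by an alignment."""
    length = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + max(length, 1) - 1


def rrna_read_names(sam_lines, intervals):
    """Pass 1: names of reads with at least one alignment in an rRNA interval."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        f = line.split("\t")
        name, flag, chrom, pos, cigar = f[0], int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or chrom == "*" or cigar == "*":
            continue  # unmapped records cannot overlap anything
        start, end = aligned_span(pos, cigar)
        if any(s <= end and start <= e for s, e in intervals.get(chrom, ())):
            names.add(name)
    return names


def drop_reads(sam_lines, names):
    """Pass 2: keep header lines and every record whose read name is not flagged."""
    for line in sam_lines:
        if line.startswith("@") or line.split("\t", 1)[0] not in names:
            yield line
```

Because the filter works on read names, it also removes the mate of a flagged read in paired-end data, which is probably what you want here.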

I don't have a feel for how different these strategies are though -- it's possible they get you roughly the same answer in the end.
