Question: How to remove probable ribosomal RNA from mRNA-seq data derived from Illumina sequencing?
4
4.1 years ago by
seta1.1k
Sweden
seta1.1k wrote:

Hi all,

I'm working on RNA-seq analysis of a non-model plant. I have paired-end read data from a poly-A-enriched library sequenced on an Illumina platform, and I would like to check for possible ribosomal RNA in my data. I would greatly appreciate it if you could let me know whether this is necessary as part of quality control before doing any analysis and, if so, how I can find and remove the probable ribosomal RNA from my data. Thanks for your feedback.

rna-seq next-gen genome • 4.2k views
modified 4.1 years ago by Renesh1.6k • written 4.1 years ago by seta1.1k
0
4.1 years ago by
Danielk560
Karolinska Institutet, Stockholm, Sweden
Danielk560 wrote:

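If you have a draft genome available along with a BED file of rDNA regions, you can use Picard's CollectRnaSeqMetrics (http://broadinstitute.github.io/picard/command-line-overview.html#CollectRnaSeqMetrics). Note that Picard takes the rRNA regions in interval_list format through its RIBOSOMAL_INTERVALS argument, so the BED file would need converting first. I guess you don't have those though?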

Also, why would you want to remove the rRNA reads? 

 

written 4.1 years ago by Danielk560

Unfortunately, I don't have a draft genome available, not even from other plants within the related family. My focus is just on coding sequences, so I used a poly-A enrichment library for sequencing.

Reply written 4.1 years ago by seta1.1k
0
4.1 years ago by
Michele Busby1.9k
United States
Michele Busby1.9k wrote:

It is useful to know what fraction of your reads are rRNA as a basic quality metric. But if you did poly-A selection (vs. a ribosomal depletion method, e.g. Ribo-Zero), the fraction is probably low, and if you have enough other reads from your plant to get what you want out of the data, you could get away with not knowing the exact rate. The only reason people care about it as a metric is that a high rRNA fraction wastes reads.

That said, you could align your reads to whatever rRNA you have from the closest plant.  Usually the rRNA is quite conserved so if you use lax alignment parameters you'll get a ballpark estimate of whether you have a problem or not.
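To make that ballpark estimate concrete, here is a minimal Python sketch. It assumes you have already aligned the reads to the rRNA reference and written SAM output that still contains the unmapped reads (check that your aligner does this with the options you use). The function name `rrna_fraction` is just a placeholder for illustration; it counts primary records using the standard SAM FLAG bits.

```python
def rrna_fraction(sam_lines):
    """Fraction of reads with a primary alignment to the rRNA reference.

    Assumes the SAM stream also lists unmapped reads, so the denominator
    is the number of sequenced reads (each mate counted separately).
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0
```

You could call it as `rrna_fraction(open("reads_vs_rrna.sam"))`, where the file name is hypothetical. With lax alignment parameters the result is only a rough indication, which is all you need to see whether there is a problem.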

Also, I would imagine you will be assembling your transcriptome. If you run the reads through Trinity, the rRNA should assemble like any other transcript, so you'll have the real rRNA sequences at the end. Then you can align the reads back to those sequences and see what you have.

You do want to know your rRNA rate before you sequence more libraries in case your protocol needs tweaking.

written 4.1 years ago by Michele Busby1.9k
0
4.1 years ago by
Manu Prestat3.9k
Marseille, France
Manu Prestat3.9k wrote:

You could map the reads (Bowtie, BWA... whatever) to an rRNA database (e.g. SILVA, which provides small- and large-subunit rRNA gene sequences, including eukaryotic ones) and keep the unmapped leftover reads for further analyses.
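The "keep the leftover" step can be sketched in plain Python. This assumes you have already collected the names of reads that aligned to the rRNA database (for example from the aligner's SAM output) into a set, and that the two FASTQ files list the mates in the same order. The helper names `read_fastq`, `read_name`, and `filter_pairs` are made up for this illustration. Many aligners can also write unaligned reads out directly, which may be simpler if your tool supports it.

```python
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as lists of four lines (no trailing newlines)."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield [line.rstrip("\n") for line in record]


def read_name(header):
    """Reduce a FASTQ header such as '@r1/1' or '@r1 1:N:0:ACGT' to 'r1'."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def filter_pairs(fq1, fq2, rrna_names):
    """Yield (mate1, mate2) record pairs whose name is not in rrna_names.

    rrna_names: set of read names for which at least one mate hit rRNA,
    so both mates of such a pair are dropped together.
    """
    for rec1, rec2 in zip(read_fastq(fq1), read_fastq(fq2)):
        if read_name(rec1[0]) in rrna_names:
            continue
        yield rec1, rec2
```

Dropping both mates whenever either one hits rRNA keeps the two output files in sync, which most downstream tools for paired reads expect.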

written 4.1 years ago by Manu Prestat3.9k
0
4.1 years ago by
Renesh1.6k
United States
Renesh1.6k wrote:

As you're working on RNA-Seq, you should see hardly any ribosomal RNA. If you want to confirm this, you can map your data to the Rfam database to check for any ribosomal RNA in your sequences.

written 4.1 years ago by Renesh1.6k

Thanks a lot for all the comments. I will check it using rRNA databases; I hope the sequences are conserved enough among plants, since I have no information even about the closest species.

Reply written 4.1 years ago by seta1.1k