How to remove probable ribosomal RNA from mRNA-seq data derived from Illumina sequencing?
9.2 years ago
seta ★ 1.9k

Hi all,

I'm working on RNA-seq analysis of a non-model plant. I have paired-end reads from a poly-A-enriched library sequenced on an Illumina platform, and I would like to check for possible ribosomal RNA in my data. Is this a necessary part of quality control before doing any analysis? If so, how can I find and remove the probable rRNA reads? Thanks for your feedback.

RNA-Seq next-gen genome • 6.7k views
9.2 years ago
Danielk ▴ 640

If you have a draft genome available along with a BED file of rDNA regions, you can use Picard's CollectRnaSeqMetrics. I guess you don't though?

Also, why would you want to remove the rRNA reads?


Unfortunately, I don't have a draft genome available, not even from other plants in the related family. My focus is only on coding sequences, which is why I used a poly-A-enriched library for sequencing.

9.2 years ago
Michele Busby ★ 2.2k

It is useful to know what fraction of your reads are rRNA as a basic quality metric. But if you did poly-A selection (vs. a ribosomal depletion method, e.g. Ribo-Zero), it's probably low, and if you have enough other reads from your plant to get what you want out of it, you could get away with not knowing the exact rate. The only reason people care about it as a metric is that a high rRNA fraction wastes reads.

That said, you could align your reads to whatever rRNA sequences you have from the closest plant. rRNA is usually quite conserved, so if you use lax alignment parameters you'll get a ballpark estimate of whether you have a problem or not.
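As a rough illustration of that ballpark estimate: once the reads have been aligned to an rRNA reference with any aligner that writes SAM, the rRNA rate is just the share of reads that mapped. The sketch below is a minimal example and not part of any particular pipeline; the function name `rrna_fraction` and the file name in the usage comment are made up for illustration. It counts each mate of a pair as one read and ignores secondary and supplementary records so that no read is counted twice.

```python
from typing import Iterable

# Standard SAM flag bits
UNMAPPED = 0x4         # read did not align
SECONDARY = 0x100      # secondary alignment of an already-counted read
SUPPLEMENTARY = 0x800  # supplementary (chimeric) alignment


def rrna_fraction(sam_lines: Iterable[str]) -> float:
    """Fraction of reads that aligned to the rRNA reference (hypothetical helper).

    `sam_lines` is any iterable of SAM text lines, e.g. an open file handle.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (@HD, @SQ, @PG, ...)
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        # Count every read once: skip secondary/supplementary records
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return mapped / total if total else 0.0


# Usage (file name is an assumption):
#   with open("reads_vs_rrna.sam") as handle:
#       print(f"rRNA rate: {rrna_fraction(handle):.1%}")
```

With lax alignment settings against a distant relative this will only be approximate, which is all that is needed to see whether there is a problem.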

Also, I would imagine you will be assembling your transcriptome. If you run it through Trinity, the rRNA should assemble like any other transcript, so you'll have the real rRNA sequences at the end. Then you can align your reads back to those sequences and see what you have.

You do want to know your rRNA rate before you sequence more libraries in case your protocol needs tweaking.

9.2 years ago

You could map the reads (Bowtie, BWA... whatever) to an rRNA database (e.g. SILVA, which provides small- and large-subunit rRNA gene sequences, including eukaryotic ones) and keep the leftover reads for further analyses.
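One way to set this up, sketched here with Bowtie 2 as an example aligner: build an index from a FASTA file of rRNA sequences with `bowtie2-build`, then align the read pairs and ask Bowtie 2 to write the pairs that did not align concordantly to separate files. The helper name `bowtie2_rrna_filter_cmd`, the index prefix, and all file names below are assumptions for illustration; the function only builds the command line, it does not run anything.

```python
from typing import List


def bowtie2_rrna_filter_cmd(
    index_prefix: str,
    reads_1: str,
    reads_2: str,
    out_prefix: str,
    threads: int = 4,
) -> List[str]:
    """Build a Bowtie 2 command that separates rRNA-like pairs from the rest.

    Assumes an index was built beforehand from an rRNA FASTA, e.g.:
        bowtie2-build rrna_sequences.fasta rrna_db
    """
    return [
        "bowtie2",
        "--very-sensitive-local",  # permissive preset, suits a distant reference
        "-p", str(threads),
        "-x", index_prefix,
        "-1", reads_1,
        "-2", reads_2,
        # Pairs that fail to align concordantly; Bowtie 2 replaces '%'
        # with 1 or 2 to name the per-mate output files.
        "--un-conc-gz", f"{out_prefix}.non_rRNA.R%.fq.gz",
        # Alignments to the rRNA reference, useful for computing the rRNA rate
        "-S", f"{out_prefix}.rRNA.sam",
    ]


# Usage (paths are assumptions):
#   import subprocess
#   cmd = bowtie2_rrna_filter_cmd("rrna_db", "s_1.fq.gz", "s_2.fq.gz", "sample")
#   subprocess.run(cmd, check=True)
```

Note that `--un-conc-gz` keeps every pair that is not a concordant hit, including pairs where only one mate matched rRNA, so this is a lenient filter; a stricter one would drop any pair with a mapped mate.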

9.2 years ago
Renesh ★ 2.2k

As you're working with poly-A-enriched RNA-seq data, you should see barely any ribosomal RNA. If you want to confirm this, you can map your data to the Rfam database to check for any ribosomal RNA in your sequences.


Thanks a lot for all the comments. I will check it using rRNA databases. I hope the sequences are conserved enough among plants, since I have no information even about the closest species.
