Question: rRNA quantification: which is the best way?
A. Domingues (Dresden, Germany) wrote, 3.1 years ago:

My goal was to quantify how many ribosomal reads (5.8S) there were in my library, and use these to normalize gene expression. It makes sense in my experiment. I took 2 approaches:

1 - read count

After mapping with TopHat, with no multi-mappers allowed, I used featureCounts to count reads mapping to features, and tallied the rRNA reads as those mapping to "LSU-rRNA_Hsa".

Using this method, there are between 10^5 and 10^6 rRNA reads for each sample.
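For reference, the tallying step can be sketched roughly as below. This is only an illustration: it assumes the standard featureCounts table layout (a `#` comment line, a header, then six annotation columns followed by one count column per BAM file) and that the repeat name appears in the `Geneid` column. The function name `tally_feature` and the example table are hypothetical.

```python
import csv


def tally_feature(counts_text, feature_name="LSU-rRNA_Hsa"):
    """Sum per-sample counts over all rows whose Geneid equals feature_name."""
    # Drop empty lines and the leading '#' comment line written by featureCounts.
    lines = [ln for ln in counts_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    # Columns 0-5 are Geneid, Chr, Start, End, Strand, Length; the rest are samples.
    samples = header[6:]
    totals = {sample: 0 for sample in samples}
    for row in reader:
        if row[0] == feature_name:
            for sample, value in zip(samples, row[6:]):
                totals[sample] += int(value)
    return totals


# Hypothetical table with two samples; the numbers are made up for illustration.
example = (
    "# Program:featureCounts (hypothetical example table)\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\ts1.bam\ts2.bam\n"
    "LSU-rRNA_Hsa\tchr1\t100\t200\t+\t101\t120\t300\n"
    "L1PA2\tchr1\t500\t900\t-\t401\t7\t9\n"
    "LSU-rRNA_Hsa\tchr2\t10\t80\t+\t71\t30\t50\n"
)
rrna_totals = tally_feature(example)
print(rrna_totals)
```

Summing over every matching row means the sketch works whether the table has one row per repeat name or one row per annotated copy.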

2 - mapping to rRNA

Following these instructions, I mapped reads directly to the rRNA sequences provided in the iGenomes bundle (AbundantSequences/hum5SrDNA) using bowtie.

Using this method, there are between 10^3 and 10^4 mapped rRNA reads for each sample. This is quite a difference!
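For the direct-mapping approach, the number of mapped reads can be counted from the aligner's SAM output along these lines. This is a minimal sketch, assuming the alignments were written in SAM format; the function name `count_mapped_reads`, the reference name `rRNA_ref`, and the example records are hypothetical.

```python
def count_mapped_reads(sam_lines):
    """Count primary alignments of mapped reads in SAM-formatted lines."""
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:    # 0x4: read is unmapped
            continue
        if flag & 0x900:  # 0x100 secondary or 0x800 supplementary alignment
            continue
        mapped += 1
    return mapped


# Hypothetical SAM records: two mapped reads, one unmapped read, and one
# secondary alignment that must not be counted as an extra read.
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:rRNA_ref\tLN:121",
    "r1\t0\trRNA_ref\t1\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t16\trRNA_ref\t5\t255\t4M\t*\t0\t0\tTTGA\tIIII",
    "r3\t272\trRNA_ref\t9\t255\t4M\t*\t0\t0\tTTGA\tIIII",
]
n_mapped = count_mapped_reads(example_sam)
print(n_mapped)
```

Counting only primary alignments keeps the number comparable to a per-read count from the first approach.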


Questions:

  1. Why is there such a difference between the methods? I actually expected to get more reads when mapping directly to the rRNA sequence, since the multi-mapping issue is avoided.

  2. Is either of these methods adequate for the purpose?

  3. Which alternative method would you suggest?

I might not actually use this data, for a number of reasons, but I am curious about what happened here.


Comment by Istvan Albert, 3.1 years ago:

I would say that the simple and straightforward answer is that the two reference sequences that you are aligning and counting against are not similar to one another.
Charles Plessy (Japan) wrote, 3.1 years ago:

I usually filter out rRNA reads using TagDust 2 and its -ref option, to which I give a FASTA file containing either the rRNA sequences or the whole rDNA locus (such as U13369 for human). To my knowledge, the human genome assembly _hg38_ does not contain the rDNA loci, which are challenging to assemble.

LSU-rRNA_Hsa is the name of an rRNA-derived repeat element. rRNA reads will tend to align there if they are not filtered out first, but I think that counting alignments to these regions is not a good way to quantify rRNA reads.

Conversely, the name _hum5SrDNA_ suggests that it does not contain all the rRNA sequences, but only that of the 5S rRNA.
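One quick way to check which sequences a reference FASTA file actually contains is to list its record names and lengths. The following is a minimal sketch; the function name `fasta_summary` and the toy records are hypothetical and stand in for the real reference file.

```python
def fasta_summary(fasta_text):
    """Return a list of (header, sequence length) tuples from FASTA text."""
    records = []
    name, length = None, 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, length))
            name, length = line[1:].strip(), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        records.append((name, length))
    return records


# Toy FASTA text (hypothetical sequences, not real rRNA).
toy_fasta = ">toy_5S example record\nACGTAC\nGT\n>toy_28S\nACGT\n"
summary = fasta_summary(toy_fasta)
for header, seq_len in summary:
    print(header, seq_len)
```

If the listing shows a single short record, the reference most likely covers only one rRNA species, which would by itself lower the number of reads that can map to it.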
