Total RNA-Seq vs mRNA-Seq
2.5 years ago
Pin.Bioinf ▴ 300

Hello, I have read that 30-50M mapped reads per sample is generally considered the optimal depth for a differential expression (DE) analysis with mRNA-Seq. What would be the minimum for TOTAL RNA-Seq? Is it a lot more?

Thank you

RNA-Seq • 3.0k views
0

I do not understand a lot about biology, as I am a computer scientist. My colleague asked me if she could add more samples to the run (which would reduce the number of reads per sample), so I am asking for the minimum number of reads (in millions) needed in both cases: total RNA-Seq and mRNA-Seq. So what I understood is: if we do ribosomal depletion for total RNA-Seq, then it will be very similar to doing mRNA-Seq, right? Then the minimum number of reads needed will be similar. But if we don't do rRNA depletion with total RNA-Seq, then we will need roughly 20 times the reads needed for mRNA-Seq, right?

0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

1
2.5 years ago

As a rough estimate: about 95% of total RNA is rRNA. If you are not depleting it in your TOTAL RNA-seq and want to get the same number of reads on your 'interesting' mRNA, then you have to multiply the number of reads by 20.
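The factor of 20 follows from only about 5% of the reads coming from non-rRNA. A minimal sketch of that arithmetic, assuming the 95% figure above and treating every non-rRNA read as informative (the function name and default are made up for illustration):

```python
def reads_needed_without_depletion(target_informative_reads, rrna_fraction=0.95):
    """Total reads to sequence so that roughly `target_informative_reads`
    come from non-rRNA, when no rRNA depletion is performed.

    rrna_fraction=0.95 is the rough estimate from the post above (an assumption).
    """
    if not 0 <= rrna_fraction < 1:
        raise ValueError("rrna_fraction must be in [0, 1)")
    # Only (1 - rrna_fraction) of all reads are informative, so scale up by its inverse.
    return round(target_informative_reads / (1 - rrna_fraction))


# The 30-50M range quoted in the question, scaled to an undepleted library.
for target in (30_000_000, 50_000_000):
    total = reads_needed_without_depletion(target)
    print(f"{target / 1e6:.0f}M informative reads -> ~{total / 1e6:.0f}M total reads")
```

With a different rRNA fraction (for example after a partially efficient depletion) the same function gives the corresponding scaling.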

0

Thank you WouterDeCoster. So you mean that if I do not deplete rRNA I would need 600-1000M reads per sample? And what if I do ribosomal depletion first?

1

To be fair, I don't think people actually do "TOTAL" RNA-seq. If you do ribosomal depletion (efficiently), then you are doing roughly the same as polyA enrichment, except that you'll also get a few lowly expressed, non-polyadenylated lncRNAs, which don't change the end result much.

0

What is the final goal? Do you want to find differentially expressed, but non-polyadenylated genes/transcripts? Is there anything wrong with polyA-enrichment?

1
2.5 years ago

This is a two-part answer:

On the topic of total RNA

As mentioned by WouterDeCoster, about 95% of cellular RNA is rRNA - therefore RNA library preparation always includes either:

1. rRNA depletion
2. poly-A selection

to enrich for the RNA of interest.

This is built into the library preparation protocols, so you don't need to think about it. You just tell the sequencing center which one you want (if you don't say anything, I would guess 95% would do poly-A selection).

Please note that rRNA depletion is often referred to as total-RNA.

On the topic of number of reads

1. To do a gene-level differential expression analysis you need 5-10e6 reads per sample.
2. The number of independent biological replicates is the major determining factor in the power you have - you need at least 3 replicates in each condition!
3. If you want to do a transcript-level analysis, enabling analysis of, among other things, isoform switches, you need to sequence deeper - 30-50e6 paired-end reads.
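For the question of adding more samples to a run, the depth guidelines above can be turned into a quick check. This is only a sketch: the run output of 400M reads is a hypothetical number (ask your sequencing center for the real one), and it assumes reads are split evenly across samples and that all of them are usable.

```python
def reads_per_sample(run_output_reads, n_samples):
    """Approximate depth per sample when a run is split evenly across samples."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return run_output_reads // n_samples


def max_samples(run_output_reads, min_reads_per_sample):
    """Largest number of samples that still reaches the minimum depth per sample."""
    return run_output_reads // min_reads_per_sample


RUN_OUTPUT = 400_000_000             # hypothetical total reads per run (assumption)
GENE_LEVEL_MIN = 10_000_000          # upper end of the 5-10e6 guideline above
TRANSCRIPT_LEVEL_MIN = 30_000_000    # lower end of the 30-50e6 guideline above

print("Max samples, gene-level analysis:      ", max_samples(RUN_OUTPUT, GENE_LEVEL_MIN))
print("Max samples, transcript-level analysis:", max_samples(RUN_OUTPUT, TRANSCRIPT_LEVEL_MIN))
print("Reads per sample with 16 samples:      ", reads_per_sample(RUN_OUTPUT, 16))
```

Keep point 2 in mind when reading the numbers: spare capacity is usually better spent on additional biological replicates than on extra depth.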

You can find more information about good practices in RNA-seq analysis here. Analysis of isoform switches - such as the analysis presented here can be done with my R-package IsoformSwitchAnalyzeR.

1

I tend to agree with what is stated above.

A few remarks though (on the number of reads part):

1. We usually go for ~15M reads for a typical experiment in Arabidopsis (this number is also somewhat related to genome/transcriptome size and the number of genes expected to be expressed).
2. Fully agree here! If you want to spend extra money, you're better off doing a few more replicates rather than adding unnecessary depth.
3. Indeed, for transcript-level analysis (all of the above is at the gene (== locus) level) you need more depth. 30-50M will be more than enough; we typically advise our wet lab to go for 20M-30M.

Again, all this for an Arabidopsis setting