How to deal with RNA-seq samples with different numbers of raw reads?
6.9 years ago
Whoknows ▴ 960

Hi all,

I have a human RNA-seq project consisting of 4 samples: 2 conditions, each with 2 replicates. Three of the four samples contain 20 million paired-end reads, but one of them contains 50 million paired-end reads. Read length is the same for all samples, 150 bp.

I'd like to know whether this difference in read counts will affect my final results.

Is there any way to normalize the data for the DE step based on library size?

Thanks.

RNA-Seq alignment • 2.5k views
6.9 years ago

You only tend to run into problems once you get ~10x differences in depth. All common RNAseq tools (DESeq2, edgeR, limma/voom, etc.) will already properly normalize your data.
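
To illustrate what "normalize" means here, below is a minimal sketch of the median-of-ratios idea that DESeq2 uses to estimate per-sample size factors. It is written in plain Python for illustration only and is not the DESeq2 implementation; the function name `size_factors` and the toy counts are made up for this example. In practice you would let the tool do this for you.

```python
import math
from statistics import median

def size_factors(counts):
    """Estimate per-sample size factors by the median-of-ratios method.

    counts: dict mapping sample name -> list of raw counts per gene,
            with genes in the same order for every sample (toy input format).
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log geometric mean of each gene across samples.
    # Genes with a zero count in any sample are skipped (marked None).
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    # Size factor of a sample = median ratio of its counts to the gene-wise
    # geometric means (computed on the log scale, then exponentiated).
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - lgm
            for g, lgm in enumerate(log_geo_means)
            if lgm is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors

# Toy data: sample D is sequenced exactly 2.5x deeper than A, B and C.
toy_counts = {
    "A": [100, 200, 50, 400],
    "B": [100, 200, 50, 400],
    "C": [100, 200, 50, 400],
    "D": [250, 500, 125, 1000],
}
factors = size_factors(toy_counts)

# Dividing raw counts by the size factor puts samples on a common scale.
normalized = {s: [c / factors[s] for c in toy_counts[s]] for s in toy_counts}
```

In this toy case the size factor of D comes out 2.5 times that of A, so after division the normalized counts of the deeper sample match the others.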


Thanks Devon, so 30 million extra reads is not a big deal for a common analysis pipeline, right?


Correct, though it's not so much the difference between them as their ratio that ends up being a problem.
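
As a quick sanity check on that point, here is a small sketch that compares the deepest and shallowest libraries by ratio rather than by difference. The sample names are placeholders, the read counts are the ones from the original question, and the ~10x figure is only the rule of thumb mentioned above, not a hard cutoff.

```python
def depth_ratio(library_sizes):
    """Return the ratio of the largest to the smallest library size."""
    smallest = min(library_sizes.values())
    largest = max(library_sizes.values())
    return largest / smallest

# Read counts from the question; sample names are hypothetical.
sizes = {
    "cond1_rep1": 20_000_000,
    "cond1_rep2": 20_000_000,
    "cond2_rep1": 20_000_000,
    "cond2_rep2": 50_000_000,
}

ratio = depth_ratio(sizes)
difference = max(sizes.values()) - min(sizes.values())

# Rule-of-thumb check: the ratio, not the absolute difference, is what matters.
needs_attention = ratio >= 10

print(f"ratio = {ratio:.1f}x, difference = {difference:,} reads")
```

Here the difference is 30 million reads, but the ratio is only 2.5x, well below the ~10x range where problems tend to show up.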


Hello,

I have faced the same problem, but to a greater degree than described above. My project includes 2 conditions (tumor vs. normal) with 21 and 5 replicates, respectively. Four samples in the tumor condition contain 250-300 million reads, while the other samples contain 60-100 million reads on average. Since the results show significant differences between these samples, it seems DESeq2 normalization does not help. Is there any suggestion for fixing this problem?

Thanks.

Reza


This should be its own question.


Thanks. I submitted it as a new question.
