Question: Seq. depth for DE analysis
22 months ago, grant.hovhannisyan wrote:

Hi Biostars,

I have RNA-seq data for a time-course experiment with 5 time points and three replicates at each.

Some replicates have about half the sequencing depth of the others. I want to check whether these low-depth replicates will affect the DE analysis. To do this, I normalized the raw counts in each replicate by library size (counts / library size × 10^6) and ran plotPCA. On the PCA I see good clustering of samples by time point, and the replicates with lower sequencing depth don't appear to be outliers. The RLE plots show the same trend.
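
For reference, the library-size scaling described above can be sketched roughly as follows. This is a minimal illustration in plain Python, assuming the intended formula is counts / library size × 10^6 (counts per million). The function name and the toy counts are made up for the example and do not come from the actual analysis.

```python
def counts_per_million(counts):
    """Scale one sample's raw gene counts to counts per million (CPM)."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library has no reads")
    return {gene: c / library_size * 1_000_000 for gene, c in counts.items()}


# Hypothetical toy data: the same expression profile sequenced at two depths.
deep = {"geneA": 2000, "geneB": 6000, "geneC": 12000}
shallow = {"geneA": 1000, "geneB": 3000, "geneC": 6000}

cpm_deep = counts_per_million(deep)
cpm_shallow = counts_per_million(shallow)

for gene in deep:
    print(gene, round(cpm_deep[gene], 1), round(cpm_shallow[gene], 1))
```

After this scaling the two toy libraries give the same values per gene, which is why a PCA on scaled counts mostly reflects composition and not depth. It does not by itself show whether the low-depth libraries have enough reads to quantify weakly expressed genes.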

Is this enough to assume that differences in sequencing depth will not affect the DE analysis? I will use DESeq2 for DE.

Thanks

modified 22 months ago by Renesh • written 22 months ago by grant.hovhannisyan
22 months ago, h.mon (Brazil) wrote:

I think PCA alone is not enough, although it is a good diagnostic. Why did some samples have such low depth? The cause may point to potential problems you will have with your analysis.

Another good diagnostic is a saturation plot, to check if all libraries have appropriate depth of sequencing.
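
One possible way to build such a saturation plot is to subsample each library at increasing fractions and count how many genes are still detected. The sketch below is only an illustration in plain Python with made-up counts. The function name, the fractions and the detection threshold are assumptions, and expanding counts into one entry per read is only practical at toy scale.

```python
import random


def saturation_curve(counts, fractions=(0.1, 0.25, 0.5, 0.75, 1.0),
                     min_reads=1, seed=0):
    """Return [(reads_sampled, genes_detected), ...] for one library."""
    rng = random.Random(seed)
    # One list entry per read, labelled by the gene it was assigned to.
    reads = [gene for gene, c in counts.items() for _ in range(c)]
    curve = []
    for frac in fractions:
        n = int(len(reads) * frac)
        sample = rng.sample(reads, n)  # subsample without replacement
        tally = {}
        for gene in sample:
            tally[gene] = tally.get(gene, 0) + 1
        detected = sum(1 for c in tally.values() if c >= min_reads)
        curve.append((n, detected))
    return curve


# Hypothetical toy library with a few highly and a few weakly expressed genes.
toy_library = {"geneA": 500, "geneB": 120, "geneC": 30, "geneD": 4, "geneE": 1}
curve = saturation_curve(toy_library)

for n_reads, n_genes in curve:
    print(n_reads, n_genes)
```

If the number of detected genes has flattened out well before the full depth of the smallest library, that library is probably deep enough. If the curve is still rising at full depth, weakly expressed genes are likely under-sampled.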

written 22 months ago by h.mon

Thanks for the suggestion! Some samples have lower depth because this is a dual RNA-seq experiment, which in our case means that human RNA was mixed with yeast RNA before library prep. Sometimes it is hard to control the proportion of human to yeast RNA in the pooled sample. But in all replicates the QC parameters, such as mapping rate and Phred scores, are high.

Reply written 22 months ago by grant.hovhannisyan
22 months ago, Renesh (United States) wrote:

Differences in sequencing depth will certainly affect your DE analysis and can inflate statistical significance if they are not corrected for. A library sequenced to greater depth will produce more reads for an equally expressed mRNA than a lower-depth library.

To avoid running your DE analysis on libraries of unequal sequencing depth, you must normalize your read counts to RPKM/FPKM/TPM units. These units scale your data by a per-million-mapped-reads factor, which corrects for the difference in sequencing depth among libraries. PCA alone will not solve your issue.

http://bioinfogeek.over-blog.com/2017/09/gene-expression-units-explained-rpm-rpkm-fpkm-and-tpm.html
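
As a rough illustration of how the units mentioned above are usually defined, here is a small plain-Python sketch of RPKM and TPM. The gene lengths and counts are hypothetical and the function names are just for this example.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene length per million mapped reads."""
    total_reads = sum(counts.values())
    return {
        gene: c / (lengths_bp[gene] / 1_000) / (total_reads / 1_000_000)
        for gene, c in counts.items()
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {gene: c / (lengths_bp[gene] / 1_000) for gene, c in counts.items()}
    scale = sum(rate.values())
    return {gene: r / scale * 1_000_000 for gene, r in rate.items()}


# Hypothetical gene lengths (bp) and the same sample at two sequencing depths.
lengths = {"geneA": 1000, "geneB": 2000}
shallow = {"geneA": 100, "geneB": 400}
deep = {gene: 2 * c for gene, c in shallow.items()}

rpkm_shallow, rpkm_deep = rpkm(shallow, lengths), rpkm(deep, lengths)
tpm_shallow, tpm_deep = tpm(shallow, lengths), tpm(deep, lengths)

print(rpkm_shallow, rpkm_deep)
print(tpm_shallow, tpm_deep)
```

In this toy case doubling the depth leaves both RPKM and TPM unchanged, which is the depth correction these units are meant to provide.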

modified 22 months ago • written 22 months ago by Renesh

Thanks for the suggestions, but I don't think performing DE with normalized units is a good idea (especially with DESeq2 or similar software), since these tools require count data and perform their own internal normalization. My concern was whether samples with low depth will still be biased somehow even after normalization.

Reply written 22 months ago by grant.hovhannisyan

If your libraries have different sequencing depths, RPKM/FPKM/TPM normalization is recommended. These normalized units will be uniform across libraries and will give you a reliable analysis. You can use cuffdiff for expression analysis. You can also use R packages for DE without first converting the counts to RPKM.

DESeq2 is designed to account for different library sizes (http://www.bioconductor.org/help/workflows/rnaseqGene/).
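
To give an idea of how a count-based tool can account for library size internally, below is a simplified plain-Python sketch of the median-of-ratios idea on which DESeq2's size-factor estimation is based. It is not DESeq2's code; the data layout, function name and toy counts are assumptions made for the example.

```python
import math
from statistics import median


def size_factors(count_matrix):
    """count_matrix: {sample: {gene: raw_count}} -> {sample: size_factor}."""
    samples = list(count_matrix)
    genes = list(count_matrix[samples[0]])
    # Keep only genes with nonzero counts in every sample so logs are defined.
    usable = [g for g in genes if all(count_matrix[s][g] > 0 for s in samples)]
    if not usable:
        raise ValueError("no gene has nonzero counts in all samples")
    # Log of the per-gene geometric mean across samples (a pseudo-reference).
    log_geo_mean = {
        g: sum(math.log(count_matrix[s][g]) for s in samples) / len(samples)
        for g in usable
    }
    # Size factor = median ratio of a sample's counts to the pseudo-reference.
    return {
        s: math.exp(median(math.log(count_matrix[s][g]) - log_geo_mean[g]
                           for g in usable))
        for s in samples
    }


# Hypothetical toy data: "deep" has exactly twice the counts of "shallow".
toy_counts = {
    "shallow": {"geneA": 10, "geneB": 50, "geneC": 200, "geneD": 0},
    "deep": {"geneA": 20, "geneB": 100, "geneC": 400, "geneD": 3},
}
sf = size_factors(toy_counts)
print(sf)
```

Raw counts divided by these factors become comparable across samples, so a twofold difference in depth like the one in the toy data is absorbed by the size factors and the raw counts can still be given to the tool.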

Reply written 22 months ago by Renesh