Question: basic statistics books to understand DESeq2
0
10 months ago by
Bioinfonext200
Korea
Bioinfonext200 wrote:

Hi,

I come from a biology background. Could you please recommend some basic biostatistics books that would help me understand the DESeq2 tutorial?

Kind Regards Bioinfonext

R • 473 views
modified 10 months ago by mmfansler350 • written 10 months ago by Bioinfonext200

Have you read the paper and the vignette?

Yes, but I need to understand the basic statistical terms used in DESeq2, for example:

1) I am always confused about how to write the design formula for one, two, or three factors.

2) When exactly should I use interaction terms, and when should I combine all factors into a single group?

3) What is the beta prior?

4) What are shrunken log fold changes? I only know the plain log fold change.

5) Dispersion is essentially an estimate of variance, but what is shrinkage?

Kind Regards Bioinfonext
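On questions 1 and 2, it can help to look at the model matrix that each formula produces. The sketch below uses only base R and a made-up sample table (the column names `genotype`, `treatment`, and `group` are hypothetical, not from any real dataset). It compares a one-factor design, an additive two-factor design, a design with an interaction term, and the "combine all factors into one group" alternative. The same formulas are what you would pass as the `design` argument when building a DESeq2 dataset.

```r
# Hypothetical sample table: 2 genotypes x 2 treatments, 2 replicates each.
samples <- data.frame(
  genotype  = factor(rep(c("WT", "KO"), each = 4), levels = c("WT", "KO")),
  treatment = factor(rep(rep(c("ctrl", "drug"), each = 2), times = 2),
                     levels = c("ctrl", "drug"))
)

# One factor: intercept (reference level) plus one treatment effect.
one_factor <- colnames(model.matrix(~ treatment, samples))

# Two factors, additive: the treatment effect is assumed equal in both genotypes.
additive <- colnames(model.matrix(~ genotype + treatment, samples))

# Interaction: lets the treatment effect differ between genotypes.
interaction_design <- colnames(
  model.matrix(~ genotype + treatment + genotype:treatment, samples)
)

# Grouping: paste the factors into a single factor with one level per combination.
samples$group <- factor(paste(samples$genotype, samples$treatment, sep = "_"))
grouped <- colnames(model.matrix(~ group, samples))

print(one_factor)
print(additive)
print(interaction_design)
print(grouped)
```

The interaction design and the grouped design both have four columns: they describe the same four group means, just parameterized differently. The grouped form is often easier to work with when you mainly want pairwise comparisons between specific combinations.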

Try this one.

Beginner’s guide to using the DESeq2 package

Michael Love, Simon Anders, Wolfgang Huber

https://bioc.ism.ac.jp/packages/2.14/bioc/vignettes/DESeq2/inst/doc/beginner.pdf

I hope this is not a duplicate answer.

Thank you very much.

The larger a gene's dispersion estimate, the larger the difference in expression has to be for that gene to be called differentially expressed (DE). As the number of replicates per condition increases, the amount of shrinkage applied to each gene's dispersion decreases, because the dispersion can then be estimated reliably from that gene's own data.
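To get a feel for why more replicates means less shrinkage, here is a toy sketch in base R. It is not the DESeq2 algorithm (DESeq2 shrinks dispersion estimates toward a fitted trend); it is a simple normal-normal shrinkage rule with made-up numbers, and the function name `shrink` and all of its parameters are hypothetical. It only mimics the general behavior: a noisy per-gene estimate is pulled toward a shared prior value, and the pull weakens as the number of replicates grows.

```r
# Toy shrinkage: weighted average of a gene's own estimate and a prior mean.
# The weight on the gene's own estimate grows as replicates reduce its noise.
shrink <- function(estimate, prior_mean, prior_var, noise_var, n) {
  w <- prior_var / (prior_var + noise_var / n)  # weight on the gene's own estimate
  w * estimate + (1 - w) * prior_mean
}

n_reps <- c(2, 5, 20, 100)

# Same raw estimate (2.0) and prior (0.5) each time; only the replicate count changes.
shrunk <- sapply(n_reps, function(n) {
  shrink(estimate = 2.0, prior_mean = 0.5, prior_var = 0.25, noise_var = 1, n = n)
})

print(data.frame(n_reps = n_reps, shrunk = shrunk))
```

With few replicates the shrunken value sits well below the raw estimate of 2.0; with many replicates it moves back toward it, which is the pattern described above.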

DESeq2 Dispersion Shrinkage - more samples is better?

https://support.bioconductor.org/p/100764/

3
10 months ago by
mmfansler350
MSKCC | New York, NY
mmfansler350 wrote:

Modern Statistics for Modern Biology by Holmes and Huber covers DESeq2 in Chapter 8 and tries to provide most of the statistical background to get there with the earlier chapters. It does assume some basic facility with R.

There is currently an online reading group, which meets weekly to present and discuss the chapters and exercises. Chapter 8 is scheduled for discussion on May 29th, 2019.

Thank you all very much. There are a few things I still do not understand:

1) Which normalization method does DESeq() use by default at the step below: vst or rlog?

```r
dds <- DESeq(dds)
```

2) What are the next steps after obtaining the results table?

```
> res.root.soil = results(diagdds, contrast = c("Tissue", "Root", "Soil"), alpha = 0.1)
> summary(res.root.soil)

out of 30438 with nonzero total read count
LFC > 0 (up)     : 10, 0.033%
LFC < 0 (down)   : 29584, 97%
outliers [1]     : 0, 0%
low counts [2]   : 5664, 19%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results
```

3) What is the purpose of this step, and when should I perform it? Is this coef the same as the contrast I used in the results() step?

```r
res <- lfcShrink(dds, coef="condition_trt_vs_untrt", type="apeglm")
```

Kind Regards
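On question 1: as far as the DESeq2 documentation describes it, `DESeq()` uses neither vst nor rlog. It normalizes with size factors estimated by the median-of-ratios method, while vst and rlog are separate transformations intended for visualization and clustering. A minimal base-R sketch of the median-of-ratios idea follows. The count matrix is made up and the helper name `median_of_ratios` is hypothetical; this is a simplification for illustration, not the package's code.

```r
# Toy count matrix: 4 genes x 3 samples (made-up numbers).
counts <- matrix(
  c( 10,  20,  40,
    100, 200, 400,
      5,  10,  20,
     50, 100, 200),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Hypothetical helper illustrating the median-of-ratios idea.
median_of_ratios <- function(counts) {
  log_geo_means <- rowMeans(log(counts))  # log geometric mean of each gene
  keep <- is.finite(log_geo_means)        # drop genes with a zero in any sample
  apply(counts, 2, function(col) {
    # Median over genes of the log ratio (sample count / gene geometric mean).
    exp(median(log(col[keep]) - log_geo_means[keep]))
  })
}

size_factors <- median_of_ratios(counts)

# Divide each sample (column) by its size factor.
normalized <- sweep(counts, 2, size_factors, "/")

print(size_factors)
print(normalized)
```

In this toy matrix each sample is exactly twice as deep as the previous one, so after dividing by the size factors every gene has the same normalized count in all three samples.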