Basic statistics books to understand DESeq2
5.3 years ago
Bioinfonext ▴ 470

Hi,

I come from a biology background. Could you please recommend basic biostatistics books that would help me understand the DESeq2 tutorial?

Kind regards, Bioinfonext


Have you read the paper and the vignette?


Yes, but I need to understand the basic statistical terms used in DESeq2, for example:

1) I am always confused about how to write the design formula for one, two, or three factors.

2) When exactly should I use interaction terms, and when should I combine all factors into a single group?

3) What is the beta prior?

4) What are shrunken log fold changes? I only know the ordinary log fold change.

5) Dispersion is broadly an estimate of variance, but what is shrinkage?


Kind regards, Bioinfonext


Try this one.

Beginner’s guide to using the DESeq2 package

Michael Love, Simon Anders, Wolfgang Huber

https://bioc.ism.ac.jp/packages/2.14/bioc/vignettes/DESeq2/inst/doc/beginner.pdf

I hope this is not a duplicate answer.


Thank you very much.


The larger a gene's dispersion value, the larger the difference in expression has to be for that gene to be called differentially expressed (DE). As the number of replicates per condition increases, the amount of dispersion shrinkage applied to each gene decreases, because the dispersion can then be estimated more reliably from that gene's own data.

DESeq2 Dispersion Shrinkage - more samples is better?

https://support.bioconductor.org/p/100764/
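
The link between replicate number and shrinkage can be illustrated with a toy calculation in base R. This is only a sketch under simplifying assumptions, not what DESeq2 actually computes: it treats the log dispersion as normally distributed, borrows the normal-theory result that the log of a variance estimate with df degrees of freedom has sampling variance trigamma(df / 2), and uses a made-up prior variance and made-up dispersion values. The function names shrinkage_weight and shrink_log_dispersion are hypothetical helpers, not part of any package.

```r
# Toy normal-normal approximation of dispersion shrinkage on the log scale.
# NOT DESeq2's actual algorithm (which maximises a posterior built on the
# negative binomial likelihood). All numbers below are invented for illustration.

# Weight placed on a gene's own estimate, for m samples and p design coefficients.
shrinkage_weight <- function(m, p = 2, prior_var = 0.5) {
  df <- m - p                       # residual degrees of freedom
  sampling_var <- trigamma(df / 2)  # assumed variance of a log dispersion estimate
  prior_var / (prior_var + sampling_var)
}

# Shrunken log dispersion: precision-weighted average of the gene-wise
# estimate and the value of the fitted trend.
shrink_log_dispersion <- function(log_genewise, log_trend, m, p = 2, prior_var = 0.5) {
  w <- shrinkage_weight(m, p, prior_var)
  w * log_genewise + (1 - w) * log_trend
}

samples <- c(4, 6, 12, 24, 48)      # total samples across two conditions
weights <- shrinkage_weight(samples)

# Hypothetical gene: gene-wise dispersion 0.5, trend value 0.1.
shrunk <- exp(shrink_log_dispersion(log(0.5), log(0.1), samples))

print(data.frame(samples,
                 weight_on_gene = round(weights, 3),
                 shrunken_dispersion = round(shrunk, 3)))
```

In this toy model the weight on the gene's own estimate rises as samples are added, so the shrunken value moves away from the trend and towards the gene-wise estimate, which is the behaviour the quoted passage describes.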

5.3 years ago
mmfansler ▴ 460

Modern Statistics for Modern Biology by Holmes and Huber covers DESeq2 in Chapter 8 and tries to provide most of the statistical background to get there with the earlier chapters. It does assume some basic facility with R.

There is currently an online reading group, which meets weekly to present and discuss the chapters and exercises. Chapter 8 is scheduled for discussion on May 29th, 2019.


Thank you all very much. There are still a few things I do not understand:

1) Which normalization method does DESeq() use by default at this step: vst or rlog?

dds <- DESeq(dds)


2) What are the next steps after getting the results table?

> res.root.soil <- results(diagdds, contrast = c("Tissue", "Root", "Soil"), alpha = 0.1)
> summary(res.root.soil)

out of 30438 with nonzero total read count
LFC > 0 (up)     : 10, 0.033%
LFC < 0 (down)   : 29584, 97%
outliers [1]     : 0, 0%
low counts [2]   : 5664, 19%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results


3) What is the purpose of this step, and when should I perform it? Is this coef the same as the contrast I used in the results() step?

res <- lfcShrink(dds, coef="condition_trt_vs_untrt", type="apeglm")


Kind Regards