Question: Differential gene expression analysis by DESeq2
umeshtanwar2 wrote (6 weeks ago):

Hi all, I am doing differential gene expression analysis using DESeq2. I have 8 samples in total (4 treated and 4 untreated), with 3 replicates of each. I am using the code given below:

library(DESeq2)
dds <- DESeqDataSetFromMatrix(countData = countdata, colData = coldata, design = ~ genotype * treatment)
# Fit the model; results() needs a fitted object
dds <- DESeq(dds)

To extract the results, I tried two approaches.

Method A:

GO1 <- results(dds, name=c("genotype_B_vs_Col.0"), alpha=0.05, lfcThreshold=2)
GO1 = subset(GO1, padj<0.05)
summary(GO1)
out of 3 with nonzero total read count
adjusted p-value < 0.05
LFC > 2.00 (up)    : 3, 100%
LFC < -2.00 (down) : 0, 0%

Method B:

GO <- results(dds, name=c("genotype_B_vs_Col.0"), alpha=0.05)
GO <- subset(GO, log2FoldChange >1 | log2FoldChange <1)
GO = subset(GO, padj<0.05)
summary(GO)
out of 2287 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 1156, 51%
LFC < 0 (down)     : 1131, 49%

I am sorry, this kind of question has been explained here many times, but I am still confused.

Question 1: Which method is correct: filtering with lfcThreshold (A), or using only the alpha value (B)? If it is A, what lfcThreshold value should be used?

Question 2: Why is there a difference between these two results? (log2FC 1 = FC 2, as I understand it.)

Could anyone help me with this, please? Thank you.

Carlo Yague wrote (6 weeks ago):

There are two confusions here. First, log2FC 1 = FC 2 is true. However, the lfcThreshold in method A is also on the log2 scale (that is what the leading "l" stands for), so lfcThreshold=2 corresponds to a fold change of 4. If you want to compare your two methods, use the same log2 threshold in both (1, for instance).
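As a quick check of the scale conversion, here is a minimal base R sketch. The variable names are only illustrative.

```r
# lfcThreshold is given on the log2 scale; 2^x converts it to a linear fold change
lfc_thresholds <- c(1, 2)
fold_changes <- 2^lfc_thresholds

# One row per threshold: log2 value next to the fold change it corresponds to
conversion <- data.frame(lfcThreshold = lfc_thresholds, fold_change = fold_changes)
print(conversion)
```

So lfcThreshold=1 is the setting that matches a two-fold change.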

With that out of the way, the main issue is that the two methods do not test the same thing, so the p-values differ. In method A, you test whether the log2 fold change is significantly larger in absolute value than lfcThreshold. In method B, you test whether it is significantly different from 0, and subsequently filter for genes with |log2FoldChange| > 1.
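A rough side-by-side sketch of the two approaches with the same log2 threshold might look like the following. It assumes simulated counts from makeExampleDESeqDataSet() in place of the original countdata and coldata, so the coefficient is named condition_B_vs_A here rather than genotype_B_vs_Col.0, and the counts it reports say nothing about the real data.

```r
library(DESeq2)

# Simulated data stand in for the poster's countdata/coldata (an assumption)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 2000, m = 8, betaSD = 1)
dds <- DESeq(dds)

coef_name <- "condition_B_vs_A"  # coefficient name of the simulated design

# Method A: the null hypothesis is |log2FC| <= 1
res_A <- results(dds, name = coef_name, alpha = 0.05, lfcThreshold = 1)
sig_A <- res_A[which(res_A$padj < 0.05), ]

# Method B: the null hypothesis is log2FC == 0, followed by an effect-size filter
res_B <- results(dds, name = coef_name, alpha = 0.05)
sig_B <- res_B[which(res_B$padj < 0.05 & abs(res_B$log2FoldChange) > 1), ]

# Number of genes retained by each approach
print(c(method_A = nrow(sig_A), method_B = nrow(sig_B)))
```

Note that the condition log2FoldChange >1 | log2FoldChange <1 used in method B of the question is true for every value except exactly 1, so it removes almost nothing; abs(log2FoldChange) > 1 is the filter described in this answer. which() is used so that genes with an NA adjusted p-value are dropped rather than kept.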

To illustrate the distinction: if a gene is overexpressed a bit more than two-fold (say, FC = 2.1), this overexpression may not be significantly greater than FC = 2 (so it fails the test in method A) while being very significantly different from no change (so it passes the test in method B), depending on the number of reads supporting the overexpression. Overall, at the same log2 threshold, method A is more stringent than method B.
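The FC = 2.1 example can be put into numbers with a simplified Wald-type calculation in base R. This is only a sketch: the standard error of 0.2 is a made-up value, and the normal-approximation formulas are an assumption meant to mirror the idea of the two tests, not DESeq2's exact internals.

```r
lfc <- log2(2.1)   # observed log2 fold change for FC = 2.1 (about 1.07)
se <- 0.2          # hypothetical standard error, chosen for illustration
threshold <- 1     # log2 scale, i.e. FC = 2

# Method B style: two-sided p-value against log2FC = 0
p_vs_zero <- 2 * pnorm(abs(lfc) / se, lower.tail = FALSE)

# Method A style: p-value against |log2FC| <= threshold, capped at 1
p_vs_threshold <- min(1, 2 * pnorm((abs(lfc) - threshold) / se, lower.tail = FALSE))

print(c(p_vs_zero = p_vs_zero, p_vs_threshold = p_vs_threshold))
```

With these numbers the gene is clearly different from no change but not distinguishable from a two-fold change. A smaller standard error (more reads supporting the estimate) would shrink both p-values.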

Hope this helps,

Carlo.


umeshtanwar2 replied (6 weeks ago):

Thank you very much, Carlo. This is really very helpful.