Question: DESeq2: Differential gene expression
gravatar for umeshtanwar2
17 months ago by
umeshtanwar220 wrote:

Hi, I am working on RNA-Seq data for Arabidopsis plant. I have samples from 4 genotypes (A, B, C, D), Untreated and after treatment, with 3 replicates for each. I used STAR for alignment and obtained count.txt file by using featureCounts. Now I am doing the DGE analysis by DESeq2. My coldata looks like this:

        genotype treatment
A.1_Control       A      Mock
A.2_Control       A      Mock
A.3_Control       A      Mock
A.1_PostDL        A   Post-HL
A.2_PostDL        A   Post-HL
A.3_PostDL        A   Post-HL
B.1_Control     B1      Mock
B.2_Control     B1      Mock
B.3_Control     B1      Mock
B.1_PostDL      B1   Post-HL
B.2_PostDL      B1   Post-HL
B.3_PostDL      B1   Post-HL
C.1_Control         C      Mock
C.2_Control         C      Mock
C.3_Control         C      Mock
C.1_PostDL          C   Post-HL
C.2_PostDL          C   Post-HL
C.3_PostDL          C   Post-HL
D.1_Control D      Mock
D.2_Control D      Mock
D.3_Control D      Mock
D.1_PostDL  D   Post-HL
D.2_PostDL  D   Post-HL
D.3_PostDL  D   Post-HL

I am using this code for analysis:

countdata <- read.table("NewTotalCounts.txt", header=TRUE, row.names=1)
countdata <- countdata[ ,6:ncol(countdata)]
 colnames(countdata) <- gsub("\\.[sb]am$", "", colnames(countdata))
countdata <- as.matrix(countdata)
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~genotype*treatment)
dds <- DESeq(dds)

"1" "Intercept"
"2" "genotype_C_vs_A"   
"3" "genotype_D_vs_A"
"4" "genotype_B_vs_A"
"5" "treatment_Post.HL_vs_Mock"
"6" "genotypeC.treatmentPost.HL"
"7" "genotypeD.treatmentPost.HL"
"8" "genotypeB.treatmentPost.HL"

I extracted the DE genes for each contrast as:

 > res_C <- results(dds, contrast="genotype_C_vs_A", alpha=0.05, lfcThreshold=0)
 > summary(res_C)
out of 27683 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 115, 0.42%
LFC < 0 (down)     : 107, 0.39%
outliers [1]       : 2, 0.0072%
low counts [2]     : 6909, 25%
(mean count < 4)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

I am generating the VennDiagram as:

venn(list(B_vs_A=rownames(res_B[which(res_B$padj <= 0.05),]), C_vs_A=rownames(res_C[which(res_C$padj <= 0.05),]), D_vs_A=rownames(res_D[which(res_D$padj <= 0.05),])))

My questions are:

  1. I am not able to understand the contrasts it is making like; contrast "genotype_C_vs_A" is under Mock condition or after treatment; what is "treatment_Post.HL_vs_Mock" - it is for which sample, and "genotypeC.treatmentPost.HL" - it is after treatment vs A or vs C(Mock) ?

  2. How can I extract these DE genes?

  3. How can I extract the DE genes which are common in two or all the contrasts?

Any help from you will be very helpful for me. Thank you

rna-seq R • 970 views
ADD COMMENTlink modified 17 months ago by swbarnes28.6k • written 17 months ago by umeshtanwar220

I've reformated your post somewhat to make it more readable. This has mostly invovled setting the output of your R commnads to be formatted as code, which can be done by indenting it 4 spaces or using the code button (marked 101010 in the toolbar).

ADD REPLYlink written 17 months ago by i.sudbery9.1k

Thank you so much @ i.sudbery.

ADD REPLYlink written 17 months ago by umeshtanwar220
gravatar for i.sudbery
17 months ago by
Sheffield, UK
i.sudbery9.1k wrote:

Question 1:

The contrasts 2-4 (i.e. "genotype_C_vs_A" etc) are the effect of genotype after the effect of light has been regressed out (or averaged over). It is effectively the expression in all the C plants (HL or Mock) vs expression in all the A plants (HL or Mock).

Contrast 5 (i.e. "treatment_Post.HL_vs_Mock") is the effect of light treatment after the effect of genotype has been regressed out (or averaged over). It is effectively the expression in all Post.HL plants (genotype A, B, C or D) vs the expression in all Mock treated plants (genotype A, B, C or D).

The remaining contrasts are the interaction effects - how does the the effect of light in genotype B (for example) differ from the effect of light on genotype A. I think. The interaction effects are always a bit difficult to understand. See this part of the DESeq user guide for more info.

Question 2:

Extract the DE genes the way you did for your venn diagram:

de_genes_genotypeC_vs_A <- res_b[res_b$padj < 0.05, ]

Question 3:

The theoretically correct way to do this is to design the correct contrast. For testing entire categories, you might like to look at the likelihood-ratio test section of the DESeq manual. For example to test for differences anywhere, you'd run your DESeq like:

dds <- DESeq(dds, test="LRT", reduced = ~1)
results_any_coeff <- results(dds)
DE_any_coeff <- results_any_coeff[results_any_coef$padj < 0.05,]

The other way to do if would be to generate the results tables for each constrast, and then use intersect on the results rownames. This is easier, but less elegant, less accurate, and more prone to threshold errors.

ADD COMMENTlink written 17 months ago by i.sudbery9.1k

Thank you a lot @i.sudbery. This was really very helpful.

ADD REPLYlink written 17 months ago by umeshtanwar220
gravatar for swbarnes2
17 months ago by
United States
swbarnes28.6k wrote:

I found this site helpful, the descriptions for querying interactions are at the bottom

ADD COMMENTlink written 17 months ago by swbarnes28.6k

Thank you so much @swbarnes2. I am more clear about interaction effects after reading this tutorial.

ADD REPLYlink written 17 months ago by umeshtanwar220
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 857 users visited in the last hour