Question: DESeq2: Differential gene expression
gravatar for umeshtanwar2
19 days ago by
umeshtanwar210 wrote:

Hi, I am working on RNA-Seq data for Arabidopsis plant. I have samples from 4 genotypes (A, B, C, D), Untreated and after treatment, with 3 replicates for each. I used STAR for alignment and obtained count.txt file by using featureCounts. Now I am doing the DGE analysis by DESeq2. My coldata looks like this:

        genotype treatment
A.1_Control       A      Mock
A.2_Control       A      Mock
A.3_Control       A      Mock
A.1_PostDL        A   Post-HL
A.2_PostDL        A   Post-HL
A.3_PostDL        A   Post-HL
B.1_Control     B1      Mock
B.2_Control     B1      Mock
B.3_Control     B1      Mock
B.1_PostDL      B1   Post-HL
B.2_PostDL      B1   Post-HL
B.3_PostDL      B1   Post-HL
C.1_Control         C      Mock
C.2_Control         C      Mock
C.3_Control         C      Mock
C.1_PostDL          C   Post-HL
C.2_PostDL          C   Post-HL
C.3_PostDL          C   Post-HL
D.1_Control D      Mock
D.2_Control D      Mock
D.3_Control D      Mock
D.1_PostDL  D   Post-HL
D.2_PostDL  D   Post-HL
D.3_PostDL  D   Post-HL

I am using this code for analysis:

countdata <- read.table("NewTotalCounts.txt", header=TRUE, row.names=1)
countdata <- countdata[ ,6:ncol(countdata)]
 colnames(countdata) <- gsub("\\.[sb]am$", "", colnames(countdata))
countdata <- as.matrix(countdata)
dds <- DESeqDataSetFromMatrix(countData=countdata, colData=coldata, design=~genotype*treatment)
dds <- DESeq(dds)

"1" "Intercept"
"2" "genotype_C_vs_A"   
"3" "genotype_D_vs_A"
"4" "genotype_B_vs_A"
"5" "treatment_Post.HL_vs_Mock"
"6" "genotypeC.treatmentPost.HL"
"7" "genotypeD.treatmentPost.HL"
"8" "genotypeB.treatmentPost.HL"

I extracted the DE genes for each contrast as:

 > res_C <- results(dds, contrast="genotype_C_vs_A", alpha=0.05, lfcThreshold=0)
 > summary(res_C)
out of 27683 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 115, 0.42%
LFC < 0 (down)     : 107, 0.39%
outliers [1]       : 2, 0.0072%
low counts [2]     : 6909, 25%
(mean count < 4)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

I am generating the VennDiagram as:

venn(list(B_vs_A=rownames(res_B[which(res_B$padj <= 0.05),]), C_vs_A=rownames(res_C[which(res_C$padj <= 0.05),]), D_vs_A=rownames(res_D[which(res_D$padj <= 0.05),])))

My questions are:

  1. I am not able to understand the contrasts it is making like; contrast "genotype_C_vs_A" is under Mock condition or after treatment; what is "treatment_Post.HL_vs_Mock" - it is for which sample, and "genotypeC.treatmentPost.HL" - it is after treatment vs A or vs C(Mock) ?

  2. How can I extract these DE genes?

  3. How can I extract the DE genes which are common in two or all the contrasts?

Any help from you will be very helpful for me. Thank you

rna-seq R • 254 views
ADD COMMENTlink modified 19 days ago by swbarnes25.3k • written 19 days ago by umeshtanwar210

I've reformated your post somewhat to make it more readable. This has mostly invovled setting the output of your R commnads to be formatted as code, which can be done by indenting it 4 spaces or using the code button (marked 101010 in the toolbar).

ADD REPLYlink written 19 days ago by i.sudbery4.3k

Thank you so much @ i.sudbery.

ADD REPLYlink written 19 days ago by umeshtanwar210
gravatar for i.sudbery
19 days ago by
Sheffield, UK
i.sudbery4.3k wrote:

Question 1:

The contrasts 2-4 (i.e. "genotype_C_vs_A" etc) are the effect of genotype after the effect of light has been regressed out (or averaged over). It is effectively the expression in all the C plants (HL or Mock) vs expression in all the A plants (HL or Mock).

Contrast 5 (i.e. "treatment_Post.HL_vs_Mock") is the effect of light treatment after the effect of genotype has been regressed out (or averaged over). It is effectively the expression in all Post.HL plants (genotype A, B, C or D) vs the expression in all Mock treated plants (genotype A, B, C or D).

The remaining contrasts are the interaction effects - how does the the effect of light in genotype B (for example) differ from the effect of light on genotype A. I think. The interaction effects are always a bit difficult to understand. See this part of the DESeq user guide for more info.

Question 2:

Extract the DE genes the way you did for your venn diagram:

de_genes_genotypeC_vs_A <- res_b[res_b$padj < 0.05, ]

Question 3:

The theoretically correct way to do this is to design the correct contrast. For testing entire categories, you might like to look at the likelihood-ratio test section of the DESeq manual. For example to test for differences anywhere, you'd run your DESeq like:

dds <- DESeq(dds, test="LRT", reduced = ~1)
results_any_coeff <- results(dds)
DE_any_coeff <- results_any_coeff[results_any_coef$padj < 0.05,]

The other way to do if would be to generate the results tables for each constrast, and then use intersect on the results rownames. This is easier, but less elegant, less accurate, and more prone to threshold errors.

ADD COMMENTlink written 19 days ago by i.sudbery4.3k

Thank you a lot @i.sudbery. This was really very helpful.

ADD REPLYlink written 12 days ago by umeshtanwar210
gravatar for swbarnes2
19 days ago by
United States
swbarnes25.3k wrote:

I found this site helpful, the descriptions for querying interactions are at the bottom

ADD COMMENTlink written 19 days ago by swbarnes25.3k

Thank you so much @swbarnes2. I am more clear about interaction effects after reading this tutorial.

ADD REPLYlink written 12 days ago by umeshtanwar210
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1793 users visited in the last hour