Question: DESeq2 compare all levels
1
gravatar for rmf
2.4 years ago by
rmf980
rmf980 wrote:

For RNA-Seq DGE analysis using DESeq2, if I have a factor with let's say 5 levels:

condition
A
A
B
B
C
C
D
D
E
E

The standard workflow

d <- DESeqDataSetFromMatrix(countData=counts,colData=meta,design=as.formula(~condition))
d <- DESeq2::estimateSizeFactors(d,type="ratio")
d <- DESeq2::estimateDispersions(d)
d1 <- nbinomWaldTest(d)
resultsNames(d1)

gives the following contrast combinations:

condition_A_vs_B
condition_A_vs_C
condition_A_vs_D
condition_A_vs_E

and then:

results(d1,name="condition_A_vs_B")

But, how do I get all pairwise combinations of all levels? ie; The first combinations as well as these below?

B - C
B - D
B - E
C - D
C - E
D - E
ADD COMMENTlink modified 10 days ago by babankolte.220 • written 2.4 years ago by rmf980

Hi Kevin, I am using your following code but getting error. I don't no why it is showing. Any help will be appreciated.

log2cutoff <- 2
qvaluecutoff <- 0.05

sigGenes <- unique(c(
  rownames(subset(res1, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res2, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res3, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
))

heat <- assay(rld)[sigGenes,]

Error:
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'unique': argument 5 is empty
ADD REPLYlink modified 10 days ago • written 10 days ago by babankolte.220

Hi friend,

Can you run either of these commands on their own to see what is produced?

rownames(subset(res1, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff))

rownames(subset(res2, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff))

rownames(subset(res3, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff))

, or, even:

subset(res1, padj<=qvaluecutoff & abs(log2FoldChange))

subset(res2, padj<=qvaluecutoff & abs(log2FoldChange))

subset(res3, padj<=qvaluecutoff & abs(log2FoldChange))

What is the output of head(res1), head(res2), and head(res3)?

ADD REPLYlink modified 10 days ago • written 10 days ago by Kevin Blighe67k

Hi Kevin, The output of head (res1), res2, and res3 are the comparison that I have taken for analysis, but when I put this all together and assigned to one variable like same in your code i.e sigGenes and it giving an error.

This is my comparison: res1 <- results(dds, contrast=c("condition", "TUB08", "TUB00"))

res2 <- results(dds, contrast=c("condition", "TUB10", "TUB00"))

res3 <- results(dds, contrast=c("condition", "TUB12", "TUB00"))

output example of head(res1): log2 fold change (MLE): condition TUB08 vs TUB00

Wald test p-value: condition TUB08 vs TUB00

DataFrame with 6 rows and 6 columns

baseMean log2FoldChange lfcSE stat pvalue padj

thrL 5547.2000 0.00245135 0.242725 0.0100993 9.91942e-01 9.94555e-01

I want to generate heatmap of all significantly expressed genes from all the above comparisons but can not proceed because of the error.

ADD REPLYlink modified 10 days ago • written 10 days ago by babankolte.220
1

Can you please use ADD REPLY to keep this thread organized rather than the answer field which is for, well, answers.

ADD REPLYlink written 10 days ago by ATpoint41k
13
gravatar for Kevin Blighe
2.4 years ago by
Kevin Blighe67k
Republic of Ireland
Kevin Blighe67k wrote:

To get all pairwise comparisons, it should be as easy as:

BC <- results(dds, contrast=c("condition", "B", "C"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
BC <- lfcShrink(dds, contrast=c("condition", "B", "C"), res=BC)

BD <- results(dds, contrast=c("condition", "B", "D"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
BD <- lfcShrink(dds, contrast=c("condition", "B", "D"), res=BD)

BE <- results(dds, contrast=c("condition", "B", "E"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
BE <- lfcShrink(dds, contrast=c("condition", "B", "E"), res=BE)

CD <- results(dds, contrast=c("condition", "C", "D"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
CD <- lfcShrink(dds, contrast=c("condition", "C", "D"), res=CD)

CE <- results(dds, contrast=c("condition", "C", "E"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
CE <- lfcShrink(dds, contrast=c("condition", "C", "E"), res=CE)

DE <- results(dds, contrast=c("condition", "D", "E"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
DE <- lfcShrink(dds, contrast=c("condition", "D", "E"), res=DE)

Note that I use lfcShrink, which you should also use if you had previously used the function DESeq() with betaPrior=FALSE

You can also do a LRT (likelihood ratio test if you want to compare all levels simultaneously).

What you can do after this is then concatenate all of the statistically significant genes into a unique list and cluster your samples using these, which should bring out good separation in a sample dendrogram (using the regularised log or variance-stabilised counts).

Kevin

ADD COMMENTlink modified 9 weeks ago • written 2.4 years ago by Kevin Blighe67k

What you can do after this is then concatenate all of the statistically significant genes into a unique list and cluster your samples >using these, which should bring out good separation in a sample dendrogram (using the regularised log or variance-stabilised counts)

I am very much interested in doing that, but I have no idea how I would go about doing that step-by-step. Could you give me any pointers as to which tools to use for the concatenating and clustering? I would appreciate this very much.

ADD REPLYlink written 21 months ago by paul-georg10
1

Hey paul, you can do something like this. If you have conducted 3 different differential expression comparisons, you can subset the results objects (res1, res2, res3) to include the statistically significant genes. You then subset your original data matrix for these genes.

log2cutoff <- 2
qvaluecutoff <- 0.05

sigGenes <- unique(c(
  rownames(subset(res1, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res2, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res3, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
))

heat <- assay(rld)[sigGenes,]

Makes sense?

ADD REPLYlink modified 21 months ago • written 21 months ago by Kevin Blighe67k
1

Oh, yes! Absolutely. I did not think it could be that easy. Thank you very much!

ADD REPLYlink written 21 months ago by paul-georg10

in case of LRT what i get is we do get genes that vary across condition. Is it possible to know if a gene is deferentially expressed which condition is it coming from ?

ADD REPLYlink written 3 months ago by krushnach80850
1

The pairwise comparisons should reveal that, no? - or, could simply plot a box-and-whiskers plot.

ADD REPLYlink written 3 months ago by Kevin Blighe67k

okay that sounds simple and straightforward

ADD REPLYlink written 3 months ago by krushnach80850

Hi Kevin

Thanks for the solution you offered for pairwise comparison. During lfcshrinkage, when I try using apeglm method it throws the following error

AL_exp1_vs_SE_exp1 <- lfcShrink(dds, contrast=c("Comparison", "AL_exp1", "SE_exp1"), res=AL_exp1_vs_SE_exp1, type="apeglm")

Error in lfcShrink(dds, contrast = c("Comparison", "AL_exp1", "SE_exp1"),  : 
  type='apeglm' shrinkage only for use with 'coef'

I am unable to understand why I can't use it. I am trying to follow DESeq2 manual where I learned it's better than other shrinkage methods.

ADD REPLYlink written 7 weeks ago by rohitsatyam102200
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 912 users visited in the last hour