Question: Comparing multiple conditions and understanding rlog/resLFC
Asked 10 weeks ago by ccha97

Hello, I'm new to R and I'm having some trouble understanding some elements of the DESeq2 package (I'm an undergrad student who has never used R prior to this project, so any help would be appreciated).

For context, I have three conditions, A, B and C (A = acute model, B = chronic model, C = a deletion in the chronic model B). I want to compare A vs B, as well as B vs C, but wasn't sure how to go about it. I was originally using the contrast argument of results():

AB <- results(dds, contrast = c("condition", "A", "B"), alpha = 0.05)

BC <- results(dds, contrast = c("condition", "B", "C"), alpha = 0.05)

My current end goal is to use k-means clustering and form a heatmap. Based on the tutorials, I understand that the rlog function is used when visualising data.

pheatmap(assay(rld)[sigGenesAB,], cluster_rows=FALSE, show_rownames=FALSE,
         cluster_cols=FALSE, annotation_col = as.data.frame(cdata), row.names=rownames(cdata))

In this case [sigGenesAB,] refers to the differentially expressed genes with padj < 0.05. However, when I generate this heatmap, it also includes condition 'C' and I don't know what to make of it. I'm also unable to use the rlog function on AB, as it gives this error:

rldAB <- rlog(AB)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘sizeFactors’ for signature ‘"DESeqResults"’
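For reference, rlog() operates on the DESeqDataSet (dds), not on a results table such as AB, which is why the call above fails. One way to keep condition C out of the heatmap is to subset the columns of the transformed matrix before plotting. The following is a minimal sketch in base R, assuming a toy matrix in place of assay(rld) and a toy sample table in place of cdata; the gene names, sample names and replicate counts are invented for illustration.

```r
# Stand-in for assay(rld): a genes x samples matrix (random values, not real data).
set.seed(1)
mat <- matrix(rnorm(30), nrow = 5,
              dimnames = list(paste0("gene", 1:5),
                              c("A1", "A2", "B1", "B2", "C1", "C2")))

# Stand-in for the sample table: one condition label per sample, named by sample.
cdata <- data.frame(condition = factor(rep(c("A", "B", "C"), each = 2)),
                    row.names = colnames(mat))

# Hypothetical list of significant genes for the A vs B comparison.
sigGenesAB <- c("gene2", "gene4")

# Keep only samples from conditions A and B, and only the significant genes.
keep <- cdata$condition %in% c("A", "B")
mat_AB <- mat[sigGenesAB, keep, drop = FALSE]

# Subset the annotation table the same way and drop the unused level "C".
cdata_AB <- droplevels(cdata[keep, , drop = FALSE])
```

With real data, the same row and column indexing would apply to assay(rld), and mat_AB together with cdata_AB (as annotation_col) could then be passed to pheatmap.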

My supervisor suggested using factor levels. His code is something like this, where he obtains a matrix of zeroes and ones that includes the intercept:

dds$condition <- factor(dds$condition, levels = c("A","B", "C"))    
condition <- factor(rep(c("A","B","C"))) 
model.matrix(~ condition)
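To make the zeroes and ones more concrete, here is a small base R sketch of what model.matrix() returns for a three-level factor. It assumes two replicates per condition (the replicate count is invented), and the data frame names df and df_refB are illustrative only. The first factor level acts as the reference and is absorbed into the intercept; each remaining column flags the samples belonging to one of the other levels.

```r
# Toy design: three conditions with two replicates each (replicate count is assumed).
df <- data.frame(condition = factor(rep(c("A", "B", "C"), each = 2),
                                    levels = c("A", "B", "C")))

# One row per sample. "A" is the first level, so it is the reference:
# the columns are (Intercept), conditionB and conditionC.
mm <- model.matrix(~ condition, data = df)
print(mm)

# Making "B" the reference changes which comparisons the columns encode:
# the columns become (Intercept), conditionA and conditionC.
df_refB <- transform(df, condition = relevel(condition, ref = "B"))
mm_refB <- model.matrix(~ condition, data = df_refB)
print(colnames(mm_refB))
```

With B as the reference, each non-intercept column is a comparison against B, which lines up with wanting both an A vs B and a B vs C comparison.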

I am aware of the Analyzing RNA-seq data with DESeq2 tutorial and have read through the sections (his code seems to be related to log fold change shrinkage/lfcShrink), but I'm still having trouble understanding things. Should I be using rlog or lfcShrink to generate a heatmap? Ultimately, I want to do k-means clustering and generate a heatmap, as well as investigate those specific clusters using GO-term analysis.

I've thought about making two different data sets (e.g. one with just the counts of A+B, and the other with just B+C) and doing a separate DESeq analysis for each, but it also means I'll have a lot of different variables which will probably get confusing downstream. I'd appreciate any help in understanding some of these concepts, as well as any recommendations regarding how I should approach my data.

Tags: heatmap, rna-seq, deseq2, R

EDIT: I've added the first line, dds$condition <- factor( c("A","B", "C")), but I'm not sure if that will change my contrast results. Is someone able to explain the idea of a model matrix to me? I also have the code for lfcShrink:

ABresLFC <- lfcShrink(dds, coef="A_vs_B", type="apeglm")
BCresLFC <- lfcShrink(dds, coef="B_vs_C", type="apeglm")

I also still want to make a heatmap for the differentially expressed genes - I've already stored the genes into variables (sigGenesAB, sigGenesBC), but just need help with coding the heatmap.
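As a rough illustration of the clustering step described above, the sketch below runs base R kmeans() on row-wise z-scores and orders the genes by cluster. It is only a sketch under stated assumptions: the matrix is random toy data standing in for the transformed counts of the significant genes, and the number of clusters (3) is an arbitrary choice, not a recommendation.

```r
# Stand-in for the transformed expression of significant genes
# (random values; gene and sample names are invented).
set.seed(42)
mat <- matrix(rnorm(120), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))

# Row-wise z-scores, so genes cluster by expression pattern rather than level.
z <- t(scale(t(mat)))

# k-means on the genes; centers = 3 is an arbitrary choice for this sketch.
km <- kmeans(z, centers = 3, nstart = 10)

# Order the rows by cluster so each cluster forms a contiguous block.
ord <- order(km$cluster)
z_ordered <- z[ord, ]

# Gene names per cluster, e.g. as input lists for GO-term analysis.
clusters <- split(rownames(z), km$cluster)
```

Passing z_ordered to pheatmap with cluster_rows = FALSE would keep the k-means ordering in the heatmap, and each element of clusters is a gene list that could be taken forward separately.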
