We have whole-transcriptome data and used DESeq2 to determine differentially expressed (DE) genes. I am able to run DESeq2 and draw heatmaps in R with heatmap.2 from the gplots library. However, I would like to do two more things:
- Group differentially expressed genes by pathway. For example, in a pool of 100 DE genes, cluster them by their biological process GO term.
- Among the DE genes in my matrix, select only a few of interest (let's say 10) and display them in a separate heatmap. Since the gene names are the row names, I understand they should go in a separate vector, but I have no clue how to do that.
Our dataset looks like this, with gene names in rows, samples in columns, and read counts as the values:
    Gene_name  Sample1  Sample2  Sample3  [...]
    Gene1      10       11       1000
    Gene2      1000     12       1
    Gene3      2999     2222     1
Here is how I used DESeq2 and built heatmaps from it:
    library(DESeq2)
    library(gplots)        # provides heatmap.2
    library(RColorBrewer)  # provides brewer.pal

    # Define samples/control
    samples <- data.frame(
      row.names = c("sample1", "sample2", "sample3", "tumor1", "tumor2", "tumor3"),
      condition = as.factor(c(rep("sample", 3), rep("tumor", 3)))
    )
    # The reference must be an existing level; "sample" is the non-tumor group here
    samples$condition <- relevel(samples$condition, ref = "sample")

    # Launch DESeq2
    dds <- DESeqDataSetFromMatrix(...etc..)
    dds <- DESeq(dds, betaPrior = FALSE)
    rld <- rlogTransformation(dds)
    DEgenes = ... as per our criteria

    # Heatmap
    # [define heatmap size etc.: lmat, lhei, lwid]
    hmcol <- colorRampPalette(rev(brewer.pal(9, "RdBu")))(255)
    heatmap.2(assay(rld)[DEgenes, ],
              Colv = FALSE, scale = "row", trace = "none", dendrogram = "row",
              key = FALSE, lmat = lmat, lhei = lhei, lwid = lwid,
              col = hmcol, margins = c(4, 10), cexCol = 1)
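
For the second point, the closest I can picture is the sketch below. It uses a toy matrix in place of assay(rld), and both the gene names and the genes_of_interest vector are made up for illustration, so I am not sure this is the right approach:

```r
# Toy stand-in for assay(rld): genes in rows, samples in columns.
set.seed(1)
mat <- matrix(rnorm(6 * 6), nrow = 6,
              dimnames = list(paste0("Gene", 1:6),
                              c("sample1", "sample2", "sample3",
                                "tumor1", "tumor2", "tumor3")))

# Hypothetical genes of interest, stored as a character vector of row names.
# "Gene9" is deliberately absent from the matrix.
genes_of_interest <- c("Gene2", "Gene5", "Gene9")

# Keep only the names that are actually present, so a typo or a filtered-out
# gene does not cause a "subscript out of bounds" error.
present <- intersect(genes_of_interest, rownames(mat))

# drop = FALSE keeps the result a matrix even if only one gene remains.
sub_mat <- mat[present, , drop = FALSE]

# With the real data, the heatmap call would presumably become something like:
# heatmap.2(assay(rld)[present, ], scale = "row", trace = "none", col = hmcol)
```

If this is right, the same vector could presumably be intersected with DEgenes first, so that only genes of interest that are also differentially expressed end up in the smaller heatmap.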