Question: heatmap in R
0
gravatar for rhasanvandj
9 weeks ago by
rhasanvandj20
rhasanvandj20 wrote:

hi friends I did differential expression analysis for HT-Seq data and I want to plot heatmap for significant genes. But my heatmap is not good. I compared two groups and my workflow is as below:

rdata <- read.table("myData.txt", header = TRUE, row.names = 1, stringsAsFactors = FALSE)

library(pheatmap)
library(DESeq2)
library(ggplot2)

## Differential abundance
alpha <- 0.05 #set the cutoff value
##fold change cut-off
beta <- 2 
## Create metadata 
sample_org <- data.frame(row.names = colnames(rdata), c(rep("0", 22), rep("1", 22)))
colnames(sample_org) <- c("Group")

dds <- DESeqDataSetFromMatrix(countData = rdata,
                              colData = sample_org,
                              design = ~Group)

dd <- DESeq(dds)


res <- results(dd)
#Saving results
write.csv(res,"res.csv")
## remove NAs
res2 <- res[!is.na(res$padj),]
###########################################################

## subset only significant genes
sig <- res2[res2$padj < alpha,]
sig2 <- sig[sig$log2FoldChange > beta,]
sig_genes <- rownames(sig2)
subset <- rdata[sig_genes,]
## log transform data for visualization
tdata <- log2(subset + 0.5)
mat <- as(tdata, "matrix")

## Set colours
my_colour = list(Group = c("0" = "blue", "1" = "yellow"))

#################
pheatmap(symbreaks = FALSE, cluster_cols = FALSE,
         cluster_rows = TRUE,color = colorRampPalette(c("#f71616","#f71616","white",
                                                        "#1919d4", "#1919d4"))(100),annotation_col = sample_org, annotation_colors = my_colour, mat, scale = "row")

I want a heatmap like this

enter image description here

that clearly show difference between two groups, over represented genes together and under represented genes together:

My heatmap is here : enter image description here

with no clear pattern

Any body can help me with this plz?

rna-seq deseq2 R • 243 views
ADD COMMENTlink modified 9 weeks ago by luiz.rocha30 • written 9 weeks ago by rhasanvandj20
4
gravatar for Jean-Karim Heriche
9 weeks ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche24k wrote:

The images you want to show are not linked. Also you seem to be plotting the log of the raw data. Try using the normalized counts instead or the log fold change since this is usually what we're interested in. Check the DESeq2 vignette section on heatmaps.

ADD COMMENTlink written 9 weeks ago by Jean-Karim Heriche24k

Hello Jean-Karim Can you explain how should I have normalized count? I usually use: log2(data + 0.5) to normalize my data. I dont know how to do normalized count?

Also, what do you mean by "use log fold change"? do you mean instead of FDR cut off for selection significant genes I use fold change cut off?

ADD REPLYlink written 8 weeks ago by rhasanvandj20

DESeq2 does normalization by dividing the counts by a sample-specific factor (details can be found in the DESeq paper). You can access normalized counts with the counts() function using the argument normalized=TRUE. Check the package documentation and the vignette. My suggestion was to also/alternatively visualize the log fold changes as heatmap.

ADD REPLYlink written 8 weeks ago by Jean-Karim Heriche24k

Thanks Karim Is there any example about visualizing log fold change as heatmap? because I still dont get this part. at the moment I have significant genes as rows and patient IDs as columns. I dont know how to replace these with log fold change?

ADD REPLYlink modified 8 weeks ago • written 8 weeks ago by rhasanvandj20
3
gravatar for luiz.rocha
9 weeks ago by
luiz.rocha30
luiz.rocha30 wrote:

You can try to use scale() instead of log or change your clustering strategy for rows.

ADD COMMENTlink written 9 weeks ago by luiz.rocha30
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2201 users visited in the last hour