Question

Can't find a pattern in heatmap, is it okay?

0

Entering edit mode

6.4 years ago

Tania ▴ 180

Hi all

Does this heatmap look okay to you? Why I can't see any pattern ? https://ibb.co/m8qaQx Here is a snapshot of my code:

cts <- txi$counts
group = factor(c(rep("Control", 10), rep("Tumor",10)) )
dge = DGEList(counts=cts, genes= rownames(cts), group=group)
countsPerMillion <- cpm(dge)  
summary(countsPerMillion)
countCheck <- countsPerMillion > 1
summary(countCheck)
keep <- which(rowSums(countCheck) >= 2)
dge <- dge[keep,]
summary(cpm(dge))
dge <- calcNormFactors(dge, method="TMM")
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
dge <- estimateTrendedDisp(dge)
et <- exactTest(dge, pair=c("Control", "Tumor"))
etp <- topTags(et, n= 100000, adjust.method="BH", sort.by="PValue")
#################################################################
###PLOTTING STARTS HERE
#################################################################

logCPM = countsPerMillion
o = rownames(etp$table[abs(etp$table$logFC)>1 & etp$table$PValue<0.05, ])
logCPM <- logCPM[o[1:100],]
colnames(logCPM) = labels
logCPM <- t(scale(t(logCPM)))
write.csv(logCPM, "ControlTumorCPM.csv")
require("RColorBrewer")
require("gplots")
myCol <- colorRampPalette(c("white", "darkgreen", "red"))(100)
myBreaks <- seq(-3, 3, length.out=101)
heatmap.2(logCPM, col=myCol, breaks=myBreaks, Rowv=TRUE,Colv=TRUE, main="Controls vs Tumors Heatmap", key=T, keysize=0.7,scale="none",trace="none", dendrogram="both", cexRow=0.2, cexCol=0.9, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6),reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean),  distfun=function(x) as.dist(1-cor(t(x))), hclustfun=function(x) hclust(x, method="ward.D2"))
dev.off()

Thanks

RNA-Seq heatmap • 2.8k views

ADD COMMENT • link 6.4 years ago by Tania ▴ 180

2

Entering edit mode

Your changes are imbalanced (there's almost no downregulated genes). Could you confirm that the data have been log-transformed and/or row-scaled. I'd strongly recommend you pick a colourscheme that will highlight differences between samples: I tend to avoid green-red anyway, but it seems a bit perverse to use green for unchanged, red for upregd and white for downregd - I'd have suggested using white as the midpoint

ADD REPLY • link 6.4 years ago by russhh 5.7k

0

Entering edit mode

Thanks so much. I think I did the logscale but not sure about the rowscale?, here is a snapshot of the code, we have down regulation, but most of the top significant genes are up regulated not down regulated, I am plotting the top 100?

logCPM = countsPerMillion
o = rownames(etp$table[abs(etp$table$logFC)>1 & etp$table$PValue<0.05, ])
logCPM <- logCPM[o[1:100],]
colnames(logCPM) = labels
logCPM <- t(scale(t(logCPM)))
write.csv(logCPM, "ControlTumorCPM.csv")
require("RColorBrewer")
require("gplots")
myCol <- colorRampPalette(c("white", "darkgreen", "red"))(100)
myBreaks <- seq(-3, 3, length.out=101)
heatmap.2(logCPM, col=myCol, breaks=myBreaks, Rowv=TRUE,Colv=TRUE, main="Controls vs Tumors Heatmap", key=T, keysize=0.7,scale="none",trace="none", dendrogram="both", cexRow=0.2, cexCol=0.9, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6),reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean),  distfun=function(x) as.dist(1-cor(t(x))), hclustfun=function(x) hclust(x, method="ward.D2"))
dev.off()

ADD REPLY • link 6.4 years ago by Tania ▴ 180

1

Entering edit mode

Could you put your code into your original question please, so it's a bit more readable

ADD REPLY • link 6.4 years ago by russhh 5.7k

0

Entering edit mode

I changed it, thanks

ADD REPLY • link 6.4 years ago by Tania ▴ 180

0

Entering edit mode

rerun it with the line logCPM <- log2(countsPerMillion)

ADD REPLY • link 6.4 years ago by russhh 5.7k

0

Entering edit mode

The above line throws the following error,

Error in hclust(x, method = "ward.D2") : NA/NaN/Inf in foreign function call (arg 11)

So I changed this line instead:

countsPerMillion <- cpm(dge, prior.count=2, log=TRUE)

Then it becomes like this: https://ibb.co/nCNsoH

Do you think it is fixed now? Thanks

ADD REPLY • link 6.4 years ago by Tania ▴ 180

1

Entering edit mode

Yes it looks ok.

I'd rewrite your script now that it works, so that your variable names reflect what they actually hold; so use logCPM instead of countsPerMillion if you're storing the logged values etc

Please note that you are filtering on p-values rather than FDR, so your results aren't very stringent

ADD REPLY • link 6.4 years ago by russhh 5.7k

0

Entering edit mode

Thanks russhh so much :)

ADD REPLY • link 6.4 years ago by Tania ▴ 180

1

Entering edit mode

Are you expecting to see one? Why?

Heatmaps are just a visualisation tool, not really an analysis tool.

Besides, there's a marked difference between your tumor and WT control as far as I can tell. YOu should probably do an ontology exploration to figure out the story behind your data.

ADD REPLY • link 6.4 years ago by Joe 21k

0

Entering edit mode

Thanks so much. In the data most of the significant genes, are upregulated not down, I plot only the top 100 in the heatmap.

ADD REPLY • link 6.4 years ago by Tania ▴ 180