Entering edit mode
6.7 years ago
Tania
▴
180
Hi all
Does this heatmap look okay to you? Why I can't see any pattern ? https://ibb.co/m8qaQx Here is a snapshot of my code:
cts <- txi$counts
group = factor(c(rep("Control", 10), rep("Tumor",10)) )
dge = DGEList(counts=cts, genes= rownames(cts), group=group)
countsPerMillion <- cpm(dge)
summary(countsPerMillion)
countCheck <- countsPerMillion > 1
summary(countCheck)
keep <- which(rowSums(countCheck) >= 2)
dge <- dge[keep,]
summary(cpm(dge))
dge <- calcNormFactors(dge, method="TMM")
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
dge <- estimateTrendedDisp(dge)
et <- exactTest(dge, pair=c("Control", "Tumor"))
etp <- topTags(et, n= 100000, adjust.method="BH", sort.by="PValue")
#################################################################
###PLOTTING STARTS HERE
#################################################################
logCPM = countsPerMillion
o = rownames(etp$table[abs(etp$table$logFC)>1 & etp$table$PValue<0.05, ])
logCPM <- logCPM[o[1:100],]
colnames(logCPM) = labels
logCPM <- t(scale(t(logCPM)))
write.csv(logCPM, "ControlTumorCPM.csv")
require("RColorBrewer")
require("gplots")
myCol <- colorRampPalette(c("white", "darkgreen", "red"))(100)
myBreaks <- seq(-3, 3, length.out=101)
heatmap.2(logCPM, col=myCol, breaks=myBreaks, Rowv=TRUE,Colv=TRUE, main="Controls vs Tumors Heatmap", key=T, keysize=0.7,scale="none",trace="none", dendrogram="both", cexRow=0.2, cexCol=0.9, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6),reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean), distfun=function(x) as.dist(1-cor(t(x))), hclustfun=function(x) hclust(x, method="ward.D2"))
dev.off()
Thanks
Your changes are imbalanced (there's almost no downregulated genes). Could you confirm that the data have been log-transformed and/or row-scaled. I'd strongly recommend you pick a colourscheme that will highlight differences between samples: I tend to avoid green-red anyway, but it seems a bit perverse to use green for unchanged, red for upregd and white for downregd - I'd have suggested using white as the midpoint
Thanks so much. I think I did the logscale but not sure about the rowscale?, here is a snapshot of the code, we have down regulation, but most of the top significant genes are up regulated not down regulated, I am plotting the top 100?
Could you put your code into your original question please, so it's a bit more readable
I changed it, thanks
rerun it with the line
logCPM <- log2(countsPerMillion)
The above line throws the following error,
So I changed this line instead:
Then it becomes like this: https://ibb.co/nCNsoH
Do you think it is fixed now? Thanks
Yes it looks ok.
I'd rewrite your script now that it works, so that your variable names reflect what they actually hold; so use logCPM instead of countsPerMillion if you're storing the logged values etc
Please note that you are filtering on p-values rather than FDR, so your results aren't very stringent
Thanks russhh so much :)
Are you expecting to see one? Why?
Heatmaps are just a visualisation tool, not really an analysis tool.
Besides, there's a marked difference between your tumor and WT control as far as I can tell. YOu should probably do an ontology exploration to figure out the story behind your data.
Thanks so much. In the data most of the significant genes, are upregulated not down, I plot only the top 100 in the heatmap.