Comparison between heat maps corrected with different algorithms
1
0
Entering edit mode
3.5 years ago
Mozart ▴ 330

Dear all,

I've been racking my brain trying to figure out possible ways to plot my an SVA-corrected matrix as an heat map in DESeq2, given the fact that even if I create a sva-corrected dds object, the rld/vsd corrected ones looks identical to the uncorrected ones.

One way of doing this is to apply a base2 logarithm on the corrected matrix but not sure if you have better ideas.

sva • 1.0k views
1
Entering edit mode
3.5 years ago

There is no context as to why you need or want to do this (?), nor whether the SVA correction was even necessary. If used improperly, 'batch' / surrogate variable correction has the potential to introduce more bias than that which originally existed in the data.

In any case, you could generate a simple symmetrical heatmap, like this:

distsRL <- dist(t(assay(rld)))
mat <- as.matrix(distsRL)
rownames(mat) <- colnames(mat)
hc <- hclust(distsRL)
gplots::heatmap.2(mat,
Rowv = as.dendrogram(hc),
symm = TRUE,
trace = 'none',
cexRow = 0.8,
cexCol = 0.8,
margin = c(7.5, 5.5),
key = FALSE)


Cordiali saluti, Kevin.

0
Entering edit mode

Hello Kevin and thanks as usual for throwing great ideas to improve my pipeline. You are absolutely right when you advice about the context to be considered but I must show (orders from above) a comparison between a 'biased' vs 'corrected' heatmap as found in the first figure here. I am not going to use this but I have to show, does it make any sense? to briefly summarise: Once the surrogate variables has been applied to my dds object

dds <- DESeq(dds)


I use this function that I found here on Biostars

cleaningY = function(y, mod, svaobj) {
X = cbind(mod, svaobj$sv) Hat = solve(t(X)%*%X)%*%t(X) beta = (Hat%*%t(y)) P = ncol(mod) cleany = y-t(as.matrix(X[,-c(1:P)])%*%beta[-c(1:P),]) return(cleany) }  the guy is applying the following code to generate a PCA plot:  #-- Clean counts_sva = cleaningY(count, mod, svseq) pca = prcomp(t( counts_sva ), scale=FALSE) plot(pca$x[,1], pca$x[,2]) text(pca$x[,1], pca$x[,2], rownames(pca$x), pos= 3 )


so just wondering if there is a similar way to generate an heatmap plotting genes expressions.

0
Entering edit mode

It makes sense to want to see that, yes. I am not sure about that code from the other Biostars post - no time to check. However, could you not just eliminate the SVA-determine batch effect on the variance-stabilised expression levels using limma::removeBatchEffect() and then plot the correct and uncorrected data in a heatmap?

0
Entering edit mode

Thanks for your reply, Kevin. I am afraid it has been asked to me to plot sva-corrected vs uncorretced heatmap. Not sure if I log transformation (log2+1) could be applied followed by a mean centering is the right way of doing this?

0
Entering edit mode

Yes, limma::removeBatchEffect() can produce the 'corrected' data for you. Then, just use the corrected and uncorrected variance-stabilised data for the heatmap. Prior to generating the heatmap, you can do the standard scaling, if you want:

ComplexHeatmap::Heatmap(data.matrix(t(scale(t(vst)))), ...)

0
Entering edit mode

Thanks Kevin, sorry for not making this clear earlier but I do need to compare exclusively SVA-corrected vs uncorrected (I know it's likely to be wrong/not required/unnecessary). do you feel that by doing this

counts_sva = cleaningY(count, mod, svseq)
data.matrix(t(scale(t(counts_sva))))


is the right way of proceeding?