Question: Heatmap.2 R for differentially expressed genes DeSEQ2
2
gravatar for User6891
3.2 years ago by
User6891250
Europe
User6891250 wrote:

Hi,

I'm trying to make a heatmap for a set of differentially expressed genes. I'm using deseq2 in R together with the heatmap.2 function

This is the command I've used

dds <- DESeq(dds)
sig <- resOrdered[!is.na(resOrdered$padj) & resOrdered$padj<0.10  abs(resOrdered$log2FoldChange)>=1,]

selected <- rownames(sig)
hmcol <- colorRampPalette(brewer.pal(9, "GnBu"))(100)
heatmap.2(log2(counts(dds,normalized=TRUE)[rownames(dds) %in% selected,]), col = hmcol, scale="row", rowv = TRUE, colv = FALSE, dendrogram="row", trace="none", margin=c(4,6), cexRow=0.5, cexCol=1, keysize=1)

when running this I get the error

Error in hclustfun(distr) : NA/NaN/Inf in foreign function call (arg 11)

Any ideas how to solve this?

R • 10k views
ADD COMMENTlink modified 8 months ago by RamRS21k • written 3.2 years ago by User6891250

@User6891 do you want to receive help ? if yes, you should give an example input so people can reproduce the error you have and can then help you with it.

ADD REPLYlink modified 8 months ago by RamRS21k • written 3.2 years ago by Mo880

I'm aware of this. The problem is that 'dds' is a 'DeSeqDataSet', so it's not that easy to just reproduce this here. I suppose that some columns contains N/A values which heatmap.2 doesn't like? But it's quite normal that an expression dataset contains N/A values, so I was wondering if somebody had encountered the same problem

ADD REPLYlink modified 8 months ago by RamRS21k • written 3.2 years ago by User6891250

I checked & the problem is that there are some '-Inf' values in the dds matrix. Any idea how to deal with those?

ADD REPLYlink modified 8 months ago by RamRS21k • written 3.2 years ago by User6891250

@User6891 There are few options, for example you can replace zero instead NA, so you won't face any problem. Do this

YourDataName[is.na(YourdataName)] <- 0

and check if it works , also a similar problem was asked on stack overflow

ADD REPLYlink modified 8 months ago by RamRS21k • written 3.2 years ago by Mo880

Inf comes because you have zeroes into your data. Log of 0 is infinite

A general function to avoid this is yomake the zeroes = 1

Log2(counts(dds... contains some zeroes...

ADD REPLYlink modified 8 months ago by RamRS21k • written 3.2 years ago by Antonio R. Franco4.0k

Hi @mbio.kyle,

when I try to follow you first lot of code:

> matrix<-matrix-rowMeans(matrix)

I get:

Error in rowMeans(matrix) : 
  'x' must be an array of at least two dimensions.

Please help??

ADD REPLYlink modified 8 months ago by RamRS21k • written 2.2 years ago by zoegward60

On the previous line you're missing a comma to index rld by topVarGenes in two dimension way:

# use the log transform on the data set
matrix <- assay(rld)[ topVarGenes, ]
matrix <- matrix - rowMeans(matrix)
ADD REPLYlink modified 13 months ago • written 13 months ago by leaodel60
9
gravatar for mbio.kyle
3.2 years ago by
mbio.kyle330
United States
mbio.kyle330 wrote:

You do not want to make heatmaps from the DeSeqDataSet object. Here is an example workflow. Note it uses the pheatmap R library

# make the DeSeqDataSet
dds <- DESeq(dds)

# use the log transform on the data set
rld <- rlog(dds, blind=F)
topVarianceGenes <- head(order(rowVars(assay(rld)), decreasing=T),100)
matrix <- assay(rld)[ topVarGenes ]
matrix <- matrix - rowMeans(matrix)

# select the 'contrast' you want
annotation_data <- as.data.frame(colData(rld)[c("ConditionA","ConditionB")])
pheatmap(matrix, annotation_col=annotation_data)

The above will plot the most varied genes across all conditions. You can substitute the 100 for whichever number you like to control how many you get. If you have alot of different conditions/contrasts and want to make some more specific heatmaps, here is a function which will plot the top 20 genes for the up and down direction (i.e. largest log2FoldChange between the two conditions up and down)

plotUpDownSigGenes <- function(results, colNums, rld, title) {

    # make the lists
    upgenes <- rownames(head(results[ order( results$log2FoldChange ), ], n=20))
    downgenes <- rownames(head(results[ order( -results$log2FoldChange ), ], n=20))

    # this gives us the rows we want
    rows <- match(upgenes, row.names(rld))
    mat <- assay(rld)[rows,colNums]
    mat <- mat - rowMeans(mat)

    # the labels are hard coded at the moment :(
    df <- as.data.frame(colData(rld)[c("labelA","labelB")])
    pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 up genes"))

    # this gives us the rows we want
    rows <- match(downgenes, row.names(rld))
    mat <- assay(rld)[rows,colNums]
    mat <- mat - rowMeans(mat)

    df <- as.data.frame(colData(rld)[c("labelA","labelB")])
    pheatmap(mat, fontsize=5, annotation_col=df, main=paste(title,"top 20 down genes"))
}

Here is an example call for the function:

contrastDEGenes <- subset(results(dds, contrast=c("A","B")), padj < 0.05)

# this part is kind of funky
# the function needs to know which columns
# correspond to the samples (to pull from rld)
aCols <- c(1,2,3)
bCols <- c(4,5,6)

# get the log transforms again
rld <- rlog(dds, blind=F)

# call
plotUpDownSigGenes(
    contrastDEGenes,
    c(aCols, bCols),
    rld,
    "Title for the plot"
)

Hope it helps!

ADD COMMENTlink modified 8 months ago by RamRS21k • written 3.2 years ago by mbio.kyle330

Thanks for the code.

I am trying to run your function and I am getting the following error:

Error in checkContrast(contrast, resNames) : 
  'contrast', as a character vector of length 3, should have the form:
contrast = c('factorName','numeratorLevel','denominatorLevel')

I am not sure how to proceed. Could you please help me?

ADD REPLYlink modified 8 months ago by RamRS21k • written 16 months ago by thiagovidotto0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1336 users visited in the last hour