Question: Heatmap for differential gene expressions
Sharon wrote, 2.6 years ago:

I am trying to do some clustering. I need to get a heatmap plot of edgeR gene expression values. I keep getting the following error, although I made sure data1 is a matrix. Any hints? Thanks

Error in heatmap.2(data1, col = col.pan, Rowv = TRUE, scale = "none", : `x' must be a numeric matrix

et <- exactTest(dge, pair=c("ctrl", "tr"))
etp <- topTags(et, n=2000000)
data1 <- as.matrix(etp$table$logFC)
heatmap.2(data1, col=col.pan, Rowv=TRUE, scale="none",trace="none", dendrogram="both", cexRow=1, cexCol=1.4, density.info="none",margin=c(10,9), lhei=c(2,10), lwid=c(2,6))
Tags: heatmap, clustering, edger
written 2.6 years ago by Sharon • modified 2.6 years ago by jaro.slamecka

Hi Sharon, how are you?

In your code above, data1 will consist of just a single column of log2 fold changes, and you will not be able to generate a heatmap from that.
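To make the point concrete, here is a small sketch with made-up fold changes standing in for etp$table$logFC (the values are hypothetical). It shows that as.matrix() on a vector gives a one-column matrix, which leaves nothing to cluster across samples; a heatmap needs at least two rows and two columns.

```r
# Toy stand-in for etp$table$logFC (values are made up for illustration)
logFC <- c(2.1, -3.4, 0.2, 1.7)

data1 <- as.matrix(logFC)
dim(data1)         # 4 rows, 1 column
is.numeric(data1)  # TRUE, yet there is only one column to draw
```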

What exactly are you aiming to do?

written 2.6 years ago by Kevin Blighe

Hi Kevin, I think you are right. I am trying to plot my gene expression values as a heatmap. But if I pass the whole etp$table instead, it still gives the same error.

written 2.6 years ago by Sharon

I believe that the 'table' object of etp is just P values, fold changes, and other stats.

I believe that you have to do something like this:

significantGenes <- rownames(subset(etp$table, abs(logFC) > 2 & FDR < 0.05))

heatmap.2(data.matrix(ExprMatrix[significantGenes,]), ...)

The first line extracts the names of genes (assuming gene names are the rownames of etp$table) that have absolute log2 FC > 2 and adjusted P value < 0.05. The second line subsets your expression matrix (assuming it's called *ExprMatrix*) to those genes and passes the result to heatmap.2.
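The two steps can be tried on toy data first. This is only a sketch: tab stands in for etp$table and the ExprMatrix here is random numbers, both hypothetical, with gene IDs assumed to be the row names of each.

```r
# Toy stand-in for etp$table; gene IDs are the row names (hypothetical values)
tab <- data.frame(
  logFC = c(3.2, -2.8, 0.4, 2.5),
  FDR   = c(0.001, 0.01, 0.20, 0.30),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Toy stand-in for the expression matrix (genes x samples)
set.seed(1)
ExprMatrix <- matrix(rnorm(16), nrow = 4,
                     dimnames = list(rownames(tab), paste0("s", 1:4)))

# Step 1: names of the genes passing both cutoffs
significantGenes <- rownames(subset(tab, abs(logFC) > 2 & FDR < 0.05))

# Step 2: keep only those rows; drop = FALSE keeps a matrix even for one hit
sub <- data.matrix(ExprMatrix[significantGenes, , drop = FALSE])
dim(sub)
```

Here geneA and geneB pass both cutoffs, geneC fails the fold-change cutoff and geneD fails the FDR cutoff, so sub has two rows and four columns.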

modified 2.6 years ago • written 2.6 years ago by Kevin Blighe

Sorry for the noise, but isn't ExprMatrix just my etp?

modified 2.6 years ago • written 2.6 years ago by Sharon

Hey Sharon,

Your original expression matrix will be whatever you passed as the 'counts' argument to DGEList(). For example, DGEList(counts=counts, group=1:2)

As far as I am aware, etp just contains information on the differential expression analysis.
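As a rough illustration of where the expression values live, the sketch below builds a DGEList from simulated counts (the gene and sample names are made up) and pulls out both the raw counts and log2 counts per million. It assumes edgeR is installed and is skipped otherwise.

```r
# Sketch only: runs when edgeR is installed; the counts are simulated
if (requireNamespace("edgeR", quietly = TRUE)) {
  set.seed(1)
  counts <- matrix(rpois(40, lambda = 50), nrow = 10,
                   dimnames = list(paste0("gene", 1:10),
                                   c("ctrl1", "ctrl2", "tr1", "tr2")))
  dge <- edgeR::DGEList(counts = counts,
                        group = c("ctrl", "ctrl", "tr", "tr"))

  dim(dge$counts)                        # raw counts: genes x samples
  logcpm <- edgeR::cpm(dge, log = TRUE)  # log2 counts per million
  dim(logcpm)                            # same shape as the counts
}
```

Either matrix has one row per gene and one column per sample, which is the shape a heatmap function expects; the result tables from exactTest and topTags do not.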

names( etp$table )

[1] "logConc" "logFC" "p.value"

The object produced by exactTest contains three elements: table, comparison and genes. The element de.com$comparison contains a vector giving the names of the two groups compared. The table de.com$table contains the elements logConc, which gives the overall concentration for a tag across the two groups being compared, logFC, which gives the log-fold change difference for the counts between the groups and p.value gives the exact p-values computed.

[source: page 23/24 of https://bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf]

modified 2.6 years ago • written 2.6 years ago by Kevin Blighe

Finally works, thanks Kevin !

written 2.6 years ago by Sharon

Great - good luck with the rest of the project

written 2.6 years ago by Kevin Blighe
jaro.slamecka (NIH National Center for Advancing Translational Sciences) wrote, 2.6 years ago:

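Hi Sharon, you're calling the heatmap.2 function on a vector of fold changes (etp$table$logFC). You should call it on a matrix of normalized expression values instead: dge$E if dge came from voom, or cpm(dge, log=TRUE) if dge is a plain DGEList, which stores raw counts in dge$counts and has no E component. Before that, subset the dge object based on the most interesting top tags from etp.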

written 2.6 years ago by jaro.slamecka

Hi jaro,

So you think I should call it with etp from the code above? That still gives the same error. Or do you mean I should create a separate matrix of all the gene names and the logFC only? I will play around and see.

written 2.6 years ago by Sharon

Your etp object only contains genes ranked by the evidence of differential expression, so you can't plot it directly: it no longer has the actual expression values. You still need it, though, to decide which genes go onto the heatmap, by doing something like the line below. In principle it does the same as Kevin Blighe's first line above: it subsets the etp table to keep only genes with absolute logFC greater than 2 and p.value below 0.02 (you might have to adjust these cutoffs to keep, say, a few hundred genes) and then takes the rownames, which should be some kind of gene IDs (like ensembl_gene_id, depending on how you did your annotation). You'll need these in the next step.

diff.genes = rownames(etp$table[abs(etp$table$logFC)>2 & etp$table$p.value<0.02, ])

Then, based on diff.genes, you'll have to subset your dge object because that's the one that has the normalized expression values:

dge.subset = dge[diff.genes, ]

The expression matrix that you'll pass to heatmap.2 is then: dge.subset$E

heatmap.2(dge.subset$E...
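Put together, the subset-then-plot step might look like the sketch below. It uses a random genes-by-samples matrix called expr in place of dge.subset$E and a hand-picked diff.genes vector, both hypothetical. The plot is drawn only if gplots is installed, and it goes to a null device so the sketch also runs non-interactively.

```r
# Toy expression matrix standing in for the normalized values (hypothetical)
set.seed(42)
expr <- matrix(rnorm(60), nrow = 10,
               dimnames = list(paste0("gene", 1:10), paste0("s", 1:6)))

# Pretend these IDs came out of the filtering step on etp$table
diff.genes <- c("gene2", "gene5", "gene9")

# Subset by row name; drop = FALSE keeps a matrix even for a single gene
expr.subset <- expr[diff.genes, , drop = FALSE]

if (requireNamespace("gplots", quietly = TRUE) && nrow(expr.subset) >= 2) {
  pdf(NULL)  # null device; remove this line and dev.off() to plot on screen
  gplots::heatmap.2(expr.subset, scale = "row", trace = "none",
                    dendrogram = "both", density.info = "none")
  invisible(dev.off())
}
```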

If you get errors, you can do this and paste the output here:

rownames(etp$table)[1:10]
colnames(etp$table)
rownames(dge)[1:10]
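One thing that output can reveal is that the two sets of IDs don't match. As a sketch with made-up IDs (assuming, purely for illustration, that the table uses versioned Ensembl-style IDs and the expression object does not), the overlap can be checked like this:

```r
# Hypothetical IDs: versioned in the results table, unversioned in the matrix
table.ids  <- c("ENSG0001.1", "ENSG0002.3", "ENSG0003.2")
matrix.ids <- c("ENSG0001", "ENSG0002", "ENSG0004")

n.match <- sum(table.ids %in% matrix.ids)   # 0: nothing matches as-is

# Strip the version suffix, then compare again
stripped <- sub("\\.[0-9]+$", "", table.ids)
shared   <- intersect(stripped, matrix.ids)  # IDs present in both
missing  <- setdiff(stripped, matrix.ids)    # IDs only in the table
```

If shared is empty or much shorter than expected, subsetting by row name will fail or silently drop genes.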

(BTW, one little detail: you can't create a "matrix" of gene names and logFC, because all values in a matrix have to be of the same type. You could create a data.frame though.)
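A quick way to see the difference, using two made-up genes:

```r
genes <- c("geneA", "geneB")
logFC <- c(2.5, -3.1)

m  <- cbind(gene = genes, logFC = logFC)       # matrix: all values become character
df <- data.frame(gene = genes, logFC = logFC)  # data.frame: each column keeps its type

typeof(m)        # "character"
class(df$logFC)  # "numeric"
```

A character matrix like m is exactly the kind of input that makes heatmap.2 complain that x must be a numeric matrix.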

written 2.6 years ago by jaro.slamecka

Finally works, thanks Jaro !

written 2.6 years ago by Sharon