Question: Heatmap with categorical variables and with phylogenetic tree in R or Python
3.3 years ago by
tlorin240 wrote:

Hi everyone! :)

I have a question and could not find an answer by searching on my own. I would like to make a heatmap with categorical variables (a bit like this one: heatmap-like plot, but for categorical variables), and I would like to add a phylogenetic tree on the left side (like this one: how to create a heatmap with a fixed external hierarchical cluster). Ideally I would adapt the second one, since it looks much prettier! ;)

Here is my data:

  • a newick-formatted phylogenetic tree, with 3 species, let's say:

  • a data frame:

    x<-c("species 1","species 2","species 3")
    df<- data.frame(x,y,z)

(with A, B and C being the categorical variables, for instance in my case presence/absence/duplicated gene).
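As a rough sketch of the data shape described here: the column names (`gene1`, `gene2`) and the category values below are invented for illustration and are not from the post. Heatmap functions expect a numeric matrix, so one option is to recode the categories as integers.

```r
# Hypothetical example data: column names and category values are made up
species <- c("species 1", "species 2", "species 3")
df <- data.frame(
  gene1 = c("A", "B", "A"),
  gene2 = c("A", "C", "B"),
  row.names = species,
  stringsAsFactors = FALSE
)

# Recode the categories as integers (A = 1, B = 2, C = 3)
categories <- c("A", "B", "C")
mat <- sapply(df, function(column) match(column, categories))
rownames(mat) <- rownames(df)
mat
```

Keeping the species as row names makes it easier to line the rows up with the tips of a tree later on.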

Would you know how to do it?

Many thanks in advance!


modified 21 months ago by Biostar ♦♦ 20 • written 3.3 years ago by tlorin240

What about the answer "How Do I Draw A Heatmap In R With Both A Color Key And Multiple Color Side Bars?" by Obi Griffith? I use this solution whenever I need to plot a heatmap with a tree. Or are you looking for something else?

modified 3.3 years ago • written 3.3 years ago by PoGibas4.7k

Thanks for your answer! It seems really useful indeed. What I do not know is how to choose the color for each category (let's say A=green, B=yellow, C=red) with the heatmap function... But it might be easy and I just have not figured it out ^.^
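One possible way to pin a color to each category, sketched with base graphics only: code the categories as integers, give one color per code, and put a break on each side of every code. The matrix values and the names `category_colors`, `breaks` and `cell_colors` are assumptions made for this example. `heatmap.2` from gplots also takes `col` and `breaks` arguments, so the same two vectors should carry over.

```r
# Integer-coded categories: 1 = A, 2 = B, 3 = C (example values)
mat <- matrix(c(1, 2, 1, 1, 3, 2), nrow = 3,
              dimnames = list(c("species 1", "species 2", "species 3"),
                              c("gene 1", "gene 2")))

# One color per category, in the same order as the integer codes
category_colors <- c(A = "green", B = "yellow", C = "red")

# One break on each side of every code, so each code falls in its own bin
breaks <- seq(0.5, length(category_colors) + 0.5, by = 1)

# Which color each cell receives
cell_colors <- matrix(category_colors[mat], nrow = nrow(mat),
                      dimnames = dimnames(mat))

# Quick visual check; image() draws matrix rows along the x axis, hence t()
image(t(mat), col = unname(category_colors), breaks = breaks, axes = FALSE)
```

Setting the breaks explicitly avoids relying on how the plotting function bins the values automatically.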

written 3.3 years ago by tlorin240
3.3 years ago by
tlorin240 wrote:

I figured out how to do it! Here is my script for those who are interested:


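#load packages
library(ape)    #read.tree, compute.brlen and the as.hclust method for phylo objects
library(gplots) #heatmap.2
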
#retrieve tree in newick format with 3 species
mytree <- read.tree("sometreewith3species.tre")
mytree_brlen <- compute.brlen(mytree, method="Grafen") #Grafen branch lengths make the tree ultrametric, which as.hclust requires

#turn the phylo tree to a dendrogram object
hc <- as.hclust(mytree_brlen) #Compulsory step as as.dendrogram doesn't have a method for phylo objects.
dend <- as.dendrogram(hc)
plot(dend, horiz=TRUE) #check dendrogram face

#create a matrix with values of each category for each species
values<-c(1,2,1,1,3,2)  #some values for the categories (1=A, 2=B, 3=C)
mat <- matrix(values,nrow=3,
              dimnames=list(mytree_brlen$tip.label, c("gene 1","gene 2"))) #rows = species in tree order; the column names are placeholders

#plot the heatmap
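heatmap.2(mat, Rowv=dend, Colv=NA, dendrogram='row',
          col = c("red", "green", "yellow"), #one color per category, matching the legend below
          breaks = c(0.5, 1.5, 2.5, 3.5),    #one bin per category value (1, 2, 3)
          key=FALSE, trace="none",           #no continuous color key or trace lines
          main="Gene presence, absence and duplication in three species")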

#legend of heatmap
par(lend=2)           # square line ends for the color legend
legend("topright",      # location of the legend on the heatmap plot
       legend = c("gene absence", "1 copy of the gene", "2 copies"), # category labels
       col = c("red", "green", "yellow"),  # color key
       lty= 1,             # line style
       lwd = 15            # line width
)

And I don't know how to show the result but it does work ;)
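To share the result, one option is to write the plot to a file and attach it. A minimal sketch, assuming a small stand-in matrix and a hypothetical output path (`out_file`); the `image()` call only stands in for the `heatmap.2()` and `legend()` calls of the script. `png()` follows the same open/plot/close pattern where a PNG device is available.

```r
# Small integer-coded example matrix (values are illustrative)
mat <- matrix(c(1, 2, 1, 1, 3, 2), nrow = 3)

# Hypothetical output path; any writable location works
out_file <- file.path(tempdir(), "heatmap.pdf")

pdf(out_file, width = 7, height = 7)  # open a file device
image(t(mat), col = c("red", "green", "yellow"),
      breaks = c(0.5, 1.5, 2.5, 3.5), axes = FALSE)  # stand-in for the heatmap.2() call
dev.off()                             # close the device so the file is written

file.exists(out_file)
```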


modified 3.3 years ago • written 3.3 years ago by tlorin240
