Question: Heatmap with categorical variables and with phylogenetic tree in R or Python
Asked 3.3 years ago by tlorin (Switzerland):

Hi everyone! :)

I have a question that I could not answer by searching on my own. I would like to make a heatmap with categorical variables (a bit like this one: "heatmap-like plot, but for categorical variables"), and I would like to add a phylogenetic tree on the left side (like this one: "how to create a heatmap with a fixed external hierarchical cluster"). Ideally I would adapt the second one, since it looks much prettier! ;)

Here is my data:

  • a newick-formatted phylogenetic tree, with 3 species, let's say:

    ((1,2),3);
  • a data frame:

    x<-c("species 1","species 2","species 3")
    y<-c("A","A","C")
    z<-c("A","B","A")
    df<- data.frame(x,y,z)

(A, B and C are the categories; in my case, for instance, gene presence, absence, or duplication.)

Would you know how to do it?

Many thanks in advance!

 


What about "A: How Do I Draw A Heatmap In R With Both A Color Key And Multiple Color Side Bars?" by Obi Griffith? I use this solution whenever I need to plot a heatmap and a tree. Or are you looking for something else?

(comment by PoGibas, written 3.3 years ago)

Thanks for your answer! It seems really useful indeed. What I do not know is how to choose the color for each category (let's say A=green, B=yellow, C=red) with the heatmap function... But it might be easy and I just have not figured it out ^.^

(reply by tlorin, written 3.3 years ago)
Answered 3.3 years ago by tlorin (Switzerland):

I figured out how to do it! Here is my script for those who are interested:

 

#load packages
library("ape")
library(gplots)

#read the tree in newick format (3 species)
mytree <- read.tree("sometreewith3species.tre")
mytree_brlen <- compute.brlen(mytree, method="Grafen") #Grafen branch lengths make the tree ultrametric, which as.hclust() requires

#turn the phylo tree into a dendrogram object
hc <- as.hclust(mytree_brlen) #Compulsory step, as as.dendrogram doesn't have a method for phylo objects.
dend <- as.dendrogram(hc)
plot(dend, horiz=TRUE) #check how the dendrogram looks

#create a matrix with the category code of each gene for each species
species <- mytree_brlen$tip.label #rows follow the tip order of the tree
genes <- c("gene1","gene2")
values <- c(1,2,1,1,3,2) #example category codes (1=A, 2=B, 3=C), filled column by column
mat <- matrix(values, nrow=3, dimnames=list(species, genes)) #avoids overwriting the base function list()

#plot the heatmap (one color per category code: 1=red, 2=green, 3=yellow)
heatmap.2(mat, Rowv=dend, Colv=NA, dendrogram='row',
          col=c("red","green","yellow"),
          breaks=c(0.5,1.5,2.5,3.5), #one bin per code, so colors stay fixed even if a code is absent
          sepwidth=c(0.01,0.02),sepcolor="black",colsep=1:ncol(mat),rowsep=1:nrow(mat),
          key=FALSE,trace="none",
          cexRow=2,cexCol=2,srtCol=45,
          margins=c(10,10),
          main="Gene presence, absence and duplication in three species")

#legend of heatmap
par(lend=2)           # square line ends for the color legend
legend("topright",      # location of the legend on the heatmap plot
       legend = c("gene absence", "1 copy of the gene", "2 copies"), # category labels
       col = c("red", "green", "yellow"),  # color key
       lty= 1,             # line style
       lwd = 15            # line width
)

 

I don't know how to post the resulting figure here, but it does work ;)
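The script above hard-codes the category codes. As a minimal sketch (base R only, not part of the original answer), the data frame from the question could be turned into the same kind of integer matrix like this. The names `category_colors`, `category_breaks`, `tip_labels` and `mat_ordered` are made up for illustration, the colors are the ones mentioned in the comments (A=green, B=yellow, C=red), and `tip_labels` is a stand-in for `mytree$tip.label`, assuming the tip labels match the species names in the data frame.

```r
# Data frame from the question: one row per species, one column per gene
df <- data.frame(
  x = c("species 1", "species 2", "species 3"),
  y = c("A", "A", "C"),
  z = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# Fix the category order once so that codes and colors always line up
# (assumed mapping, taken from the comments: A=green, B=yellow, C=red)
category_colors <- c(A = "green", B = "yellow", C = "red")
categories <- names(category_colors)

# Encode every column against the same levels: A -> 1, B -> 2, C -> 3
codes <- sapply(df[, c("y", "z")],
                function(col) as.integer(factor(col, levels = categories)))
mat <- matrix(codes, nrow = nrow(df), dimnames = list(df$x, c("y", "z")))

# One bin per category, so a category missing from the data cannot shift the colors
category_breaks <- seq(0.5, length(categories) + 0.5, by = 1)

# Put the rows in the tip order of the tree.
# Hypothetical label vector: with ape this would be mytree$tip.label,
# and the labels must match rownames(mat).
tip_labels <- c("species 3", "species 1", "species 2")
mat_ordered <- mat[tip_labels, , drop = FALSE]
```

The result could then be passed to the heatmap.2() call above as `mat_ordered`, with `col = unname(category_colors)` and `breaks = category_breaks`. Note that the example tree in the question labels its tips 1, 2, 3 while the data frame uses "species 1" etc., so one of the two would have to be renamed first.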

 
