ggtree and issues with NA
7 months ago
pdhrati02 ▴ 30

Hi all, I am using ggtree and gheatmap to visualize a tree in R. The tree was generated with PhyloPhlAn. I built the plot as follows:

tree2 <- read.tree("mytree.tre")

x <- ggtree(tree2, layout="circular")

Load the metadata file

dd=read.table('metadata.tsv', header=T,check.names=FALSE, sep = "\t")

p1 <- gheatmap(x, dd, width=0.2, offset=0.8, color = NA, colnames_angle=90, colnames_offset_y = 0.25) + scale_fill_viridis_d(option="D", name="Phylum")

The Phylum column in my metadata has no NAs. However, the resulting plot shows NA alongside the phylum names. This produces separate lines in my heatmap instead of continuous bars (see attached image).

Can anyone help me with what the issue could be here? Why the NAs are present and how to get rid of them?

Any help would be appreciated, thank you. Best DP

ggtree gheatmap • 368 views
7 months ago

Evidently, your tree contains species/strains for which no Phylum is annotated in the metadata you provided. This can be a mapping issue, or the data may genuinely be missing. I recommend starting the troubleshooting by checking whether all row names of your heatmap data match the tip labels of your tree: setdiff(rownames(dd),tree2$tip.label).

Minimal working example to reproduce a similar result:

library("ggtree")
library("treeio")
library("ggplot2")

nwk <- system.file("extdata", "sample.nwk", package="treeio")

tree <- read.tree(nwk)
circ <- ggtree(tree, layout = "circular")

df <- data.frame(A=sample(LETTERS[1:6],length(tree$tip.label),replace=TRUE),
                 B=sample(LETTERS[1:6],length(tree$tip.label),replace=TRUE))

rownames(df) <- tree$tip.label
rownames(df)[c(3,7)] <- c("Notatip1","Notatip2")  #include two non-matching tip/row names into the df

p1 <- gheatmap(circ, df, offset=.8, width=.2,
               colnames_angle=95, colnames_offset_y = .25) +
  scale_fill_viridis_d(option="D", name="Phylum")

print(p1)

# which row names of the heatmap data have no matching tip in the tree?
setdiff(rownames(df),tree$tip.label)

Minimal example
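
Because setdiff is not symmetric, it can help to run the comparison in both directions, and to keep in mind that equal counts of tips and metadata rows do not guarantee that the names match. A minimal base-R sketch, assuming made-up labels (tip_labels and meta_ids are hypothetical stand-ins for tree2$tip.label and rownames(dd)):

```r
# Hypothetical labels standing in for tree2$tip.label and rownames(dd)
tip_labels <- c("s1", "s2", "s3", "s4")
meta_ids   <- c("s1", "S2", "s3 ", "s5")

# Same number of tips and metadata rows, yet only one name matches exactly
same_count <- length(tip_labels) == length(meta_ids)

# Tips with no metadata row: the likely source of NA cells in the heatmap
tips_without_metadata <- setdiff(tip_labels, meta_ids)

# Metadata rows with no tip: these have nothing in the tree to attach to
rows_without_tip <- setdiff(meta_ids, tip_labels)

print(same_count)
print(tips_without_metadata)
print(rows_without_tip)
```

If both results are empty (character(0)), the labels agree exactly and the NAs would have to come from somewhere else.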

7 months ago
pdhrati02 ▴ 30

Hi, thank you for your quick response. I will have a look and try to use the links you provided. However, my tree shows 1448 tip labels when loaded, which is the same as the number of rows in my metadata (rows are samples), and each one is definitely classified at the phylum level. I will try your suggestions. Thank you very much once again. DP

ADD COMMENT

Of course, I can't rule out that something else is going on, but I am still convinced that the most likely explanation is a mismatch between the provided metadata and the tree. Either the tip labels are incorrect (maybe capitalized differently?) or the row names were mangled when reading in your metadata. I have added a minimal working example to my initial reply to better illustrate what I meant.
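
To illustrate the two failure modes mentioned above, here is a rough base-R sketch. It assumes a two-column metadata table whose first column holds the sample IDs; the inline tsv string, the norm helper, and the tip_labels vector are all hypothetical and only stand in for metadata.tsv and tree2$tip.label. It is one possible way to diagnose and repair near-misses, not necessarily what fixed the original problem.

```r
# Hypothetical metadata; in practice this would be read from metadata.tsv
tsv <- "sample\tPhylum\nS1 \tFirmicutes\ns2\tBacteroidetes\n"

# row.names = 1 takes the first column as row names. Without it, a file like
# this one typically gets row names "1", "2", ... which match no tip label.
dd <- read.table(text = tsv, header = TRUE, sep = "\t", row.names = 1,
                 check.names = FALSE)

# Hypothetical stand-in for tree2$tip.label
tip_labels <- c("s1", "s2")

# Compare labels after removing case and surrounding-whitespace differences
norm <- function(x) tolower(trimws(x))
idx <- match(norm(tip_labels), norm(rownames(dd)))

# Tips that still have no partner after normalisation need a closer look
unmatched <- tip_labels[is.na(idx)]

# Reorder the metadata to tip order and adopt the exact tip labels
dd_fixed <- dd[idx, , drop = FALSE]
rownames(dd_fixed) <- tip_labels
print(dd_fixed)
```

Normalising like this only makes sense if the labels stay unique afterwards; if two samples differ only by case, they need to be resolved by hand.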

ADD REPLY

Hi, thank you very much, I followed your suggestions and realized the issue with annotations and corrected it. Thank you very much once again for your help. Best Dhrati
