Midpoint rooting IQTREE newick file moves node support around
10 weeks ago
Luca Arbore ▴ 10

Hello,

I have a Newick tree built with IQ-TREE 2 that I need to root at the midpoint. Rooting it with either phangorn::midpoint or ape::root and then rendering it with ggtree does not fix the problem of support values moving between nodes. I have read around and understand what the issue is at its core, but I suspect that in my case there is an extra complication: read.tree, unlike read.iqtree, will not read the ultrafast bootstrap support values, so I cannot reproduce the solutions I found online. I simply cannot get to the bottom of it, even though the ape package was updated to avoid this issue.

the tree, Oyenano_VP3.fasta.treefile:

(Lishui_pangolin_virus:0.7256639399,Shelly_headland_virus:0.6270142136,((((((Gierle_tick_virus:0.2004803328,Jeddah_tick_coltivirus:0.3474474851)100:0.4704029155,Kundal_virus:0.6885144017)75:0.1241162043,Tarumizu_tick_virus:0.8218633021)100:0.2835495753,Oyenano_coltivirus:1.0833972242)85:0.1663221127,Tai_Forest_reovirus:0.9855843970)100:0.9633777986,((Salmon_River_virus:0.0238290668,Colorado_tick_fever_virus:0.0251385092)95:0.0916960395,(Eyach_virus:0.1227069359,California_hare_coltivirus:0.0738599317)67:0.0805930227)100:0.8972989149)100:1.0667916270);

the code:

colti_vp3 <- read.iqtree(here("data/trees/Oyenano_VP3.fasta.treefile"))
colti_vp3@phylo <- phangorn::midpoint(colti_vp3@phylo, node.labels = "support")
colti_vp3@phylo$tip.label <- gsub(colti_vp3@phylo$tip.label, pattern = "_", replacement=" ")

colti_vp3_tree <- ggtree(colti_vp3, size = 1) +
  geom_tiplab(aes(label = label), size = 11, align = TRUE) +
  geom_nodelab(aes(label = UFboot), hjust = -.5) +
  geom_point2(aes(subset = !isTip & UFboot > 89, fill = cut(UFboot, c(0, 90))), shape = 16, size = 5) +
  geom_treescale(x = 0, y = 11.2, fontsize = 10, linesize = 1, offset = .1) +
  coord_cartesian(clip = "off") +
  theme(plot.title = element_markdown(size = 31, face = 2),
        plot.margin = margin(.5, 12, 0, 0, "cm")) +
  guides(fill = FALSE, color = FALSE, shape = FALSE)

Any help would be greatly appreciated, thank you!


When you say it's moving the values around, do you mean it is literally moving them within the tree string in the file, or is just the visual representation getting messed up?

It would help if you could share an image of the 'standard' and midpoint-rooted trees to compare.


What I mean is that some support values are moved between nodes in the ggtree rendering. This is a known problem that was supposedly fixed in ape: https://academic.oup.com/mbe/article/34/6/1535/3077051

Also: https://github.com/YuLab-SMU/ggtree/issues/89

The problem is that I use read.iqtree from the treeio package to read the tree information, and the object it returns is not compatible with the rooting functions from the other packages.

The original unrooted tree is on the left and the rooted tree is on the right. Note how the values shift around Tarumizu, Oyenano and Tai Forest.

[Image: original unrooted tree (left) and midpoint-rooted tree (right)]

3 days ago
Kevin Blighe ★ 90k

The midpoint rooting function modifies the structure of the phylogenetic tree object, so the node identifiers are renumbered. The associated data in the treedata object, such as the ultrafast bootstrap support values, remain linked to the original node identifiers. This mismatch makes the support values appear to shift between nodes when the tree is rendered with ggtree.
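The change in node numbering can be checked directly. The sketch below is only an illustration, not part of the original workflow: it parses the tree from the question with ape::read.tree (which keeps the support values as plain node labels), midpoint-roots it with phangorn, and compares the internal node counts. The object names newick, unrooted and rooted are made up for the example.

```r
library(ape)
library(phangorn)

# Tree string copied from the question
newick <- "(Lishui_pangolin_virus:0.7256639399,Shelly_headland_virus:0.6270142136,((((((Gierle_tick_virus:0.2004803328,Jeddah_tick_coltivirus:0.3474474851)100:0.4704029155,Kundal_virus:0.6885144017)75:0.1241162043,Tarumizu_tick_virus:0.8218633021)100:0.2835495753,Oyenano_coltivirus:1.0833972242)85:0.1663221127,Tai_Forest_reovirus:0.9855843970)100:0.9633777986,((Salmon_River_virus:0.0238290668,Colorado_tick_fever_virus:0.0251385092)95:0.0916960395,(Eyach_virus:0.1227069359,California_hare_coltivirus:0.0738599317)67:0.0805930227)100:0.8972989149)100:1.0667916270);"

unrooted <- read.tree(text = newick)
rooted <- midpoint(unrooted)

# Midpoint rooting inserts a new root node on a branch, so the rooted tree
# has one more internal node than the unrooted one.
node_counts <- c(unrooted = Nnode(unrooted), rooted = Nnode(rooted))
print(node_counts)

# A data table built for the unrooted tree has one row per old internal node,
# so it cannot line up row-for-row with the nodes of the rooted tree.
print(is.rooted(unrooted))
print(is.rooted(rooted))
```

Because the two trees do not even have the same number of internal nodes, a table keyed on the old node numbers cannot be reused as it is.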

To resolve this, update the associated data slot after applying the midpoint rooting. The phangorn midpoint function preserves the support values in the node labels of the new tree when the node.labels parameter is set to "support". You can then recreate the data slot using these node labels.

Here is the modified code:

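library(here)     # here()
library(treeio)   # read.iqtree()
library(ggtree)   # ggtree() and its geoms
library(ggplot2)  # theme(), guides(), coord_cartesian()
library(ggtext)   # element_markdown()

colti_vp3 <- read.iqtree(here("data/trees/Oyenano_VP3.fasta.treefile"))
colti_vp3@phylo <- phangorn::midpoint(colti_vp3@phylo, node.labels = "support")

# Rebuild the data slot from the node labels of the rooted tree.
# Assumption: colti_vp3@phylo$node.label still holds the support values after read.iqtree().
# Internal nodes are numbered ntip + 1 (the root) to ntip + nnode, one per node label.
ntip <- length(colti_vp3@phylo$tip.label)
nnode <- colti_vp3@phylo$Nnode
internal_nodes <- (ntip + 1):(ntip + nnode)
colti_vp3@data <- tibble::tibble(node = internal_nodes, UFboot = as.numeric(colti_vp3@phylo$node.label))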

colti_vp3@phylo$tip.label <- gsub(colti_vp3@phylo$tip.label, pattern = "_", replacement = " ")

colti_vp3_tree <- ggtree(colti_vp3, size = 1) +
  geom_tiplab(aes(label = label), size = 11, align = TRUE) +
  geom_nodelab(aes(label = UFboot), hjust = -0.5) +
  geom_point2(aes(subset = !isTip & UFboot > 89, fill = cut(UFboot, c(0, 90))), shape = 16, size = 5) +
  geom_treescale(x = 0, y = 11.2, fontsize = 10, linesize = 1, offset = 0.1) +
  coord_cartesian(clip = "off") +
  theme(plot.title = element_markdown(size = 31, face = 2),
        plot.margin = margin(0.5, 12, 0, 0, "cm")) +
  guides(fill = "none", color = "none", shape = "none")

This approach ensures that the support values are correctly reassigned to the new node identifiers. If your tree includes additional associated data beyond ultrafast bootstrap values, you may need to map those values manually based on clade membership.
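A minimal sketch of what mapping by clade membership could look like, using the tree from the question read with ape::read.tree. The helper names tips_below and split_keys are hypothetical, written for this example only. The idea is that a support value belongs to a branch, and a branch is identified by the two sets of tips it separates. Describing each internal node by the side of its parent branch that does not contain a fixed reference tip gives a key that stays the same however the tree is rooted.

```r
library(ape)
library(phangorn)

# Tree string copied from the question
newick <- "(Lishui_pangolin_virus:0.7256639399,Shelly_headland_virus:0.6270142136,((((((Gierle_tick_virus:0.2004803328,Jeddah_tick_coltivirus:0.3474474851)100:0.4704029155,Kundal_virus:0.6885144017)75:0.1241162043,Tarumizu_tick_virus:0.8218633021)100:0.2835495753,Oyenano_coltivirus:1.0833972242)85:0.1663221127,Tai_Forest_reovirus:0.9855843970)100:0.9633777986,((Salmon_River_virus:0.0238290668,Colorado_tick_fever_virus:0.0251385092)95:0.0916960395,(Eyach_virus:0.1227069359,California_hare_coltivirus:0.0738599317)67:0.0805930227)100:0.8972989149)100:1.0667916270);"

unrooted <- read.tree(text = newick)
rooted <- midpoint(unrooted)

# Hypothetical helper: labels of all tips descending from a node
tips_below <- function(tree, node) {
  kids <- tree$edge[tree$edge[, 1] == node, 2]
  if (length(kids) == 0) return(tree$tip.label[node])
  unlist(lapply(kids, tips_below, tree = tree))
}

# Hypothetical helper: one rooting-independent key per internal node,
# built from the side of the node's parent branch that lacks ref_tip
split_keys <- function(tree, ref_tip) {
  nodes <- Ntip(tree) + seq_len(Nnode(tree))
  vapply(nodes, function(nd) {
    side <- tips_below(tree, nd)
    if (ref_tip %in% side) side <- setdiff(tree$tip.label, side)
    paste(sort(side), collapse = "|")
  }, character(1))
}

ref_tip <- sort(unrooted$tip.label)[1]
keys_old <- split_keys(unrooted, ref_tip)
keys_new <- split_keys(rooted, ref_tip)

# Support values as stored on the unrooted tree (the root has no label, so NA)
support_old <- as.numeric(unrooted$node.label)

# Look up each node of the rooted tree by its key
remapped <- data.frame(
  node = Ntip(rooted) + seq_len(Nnode(rooted)),
  UFboot = support_old[match(keys_new, keys_old)]
)
print(remapped)
```

The remapped table has the same node and UFboot columns as the data slot built in the code above, so it could be assigned to it in the same way, and any other per-branch column could be carried across with the same match() call. Note that the two children of the new root sit on the same original branch, so both receive that branch's value, while the root itself stays NA.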

Kevin
