R phangorn phylogenetic analysis from somatic mutations' binary table
8.2 years ago

I want to use a binary matrix to build a phylogenetic tree using the R package phangorn, as done in this paper: "Tracking the Genomic Evolution of Esophageal Adenocarcinoma through Neoadjuvant Chemotherapy".

In the methods they say:

Trees were built using binary presence/absence matrices built from the regional distribution of variants within the tumor. The R Bioconductor package phangorn (1.99-7; ref. 36) was utilized to perform the parsimony ratchet method (18), generating unrooted trees. Branch lengths were determined using the acctran function.

I have the binary presence/absence matrix. However, the phangorn package works with phyDat objects, which, according to the phangorn vignettes, are derived from sequence alignments.

My question is:

How can I use a binary table to build a phylogenetic tree with the R phangorn package?

If there is a way to read the binary matrix as a phyDat object, that would solve the problem, but I don't see how that could be done.

phangorn R tumour phylogeny binary table • 6.7k views
8.2 years ago
Klaus S ▴ 150

Hello Alejandro,

phangorn provides the generic function as.phyDat(), which has methods to transform matrices and data.frames into phyDat objects.

For example, you can read in your data with read.table() or read.csv(), but you might need to transpose it. For a matrix, as.phyDat() assumes that each row belongs to one individual (taxon); for a data.frame, it assumes that each column does. Binary data can be transformed with a command like one of the following (depending on how you coded them):

as.phyDat(data, type="USER", levels = c(0, 1))         # data coded as 0/1
as.phyDat(data, type="USER", levels = c(TRUE, FALSE))  # data coded as TRUE/FALSE
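To connect this to the workflow quoted in the question (parsimony ratchet, then acctran for branch lengths), here is a minimal sketch. The matrix `mat` and its region and mutation names are made up for illustration, and the code has not been checked against the specific phangorn version used in the paper, so treat it as a starting point rather than a reproduction of their analysis.

```r
library(phangorn)

# Hypothetical presence/absence matrix (illustration only):
# rows = tumour regions (taxa), columns = variants.
mat <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,   # germline (no somatic variants)
                1, 1, 1, 0, 0, 0, 1, 0, 0,   # R1
                1, 1, 1, 0, 0, 0, 0, 1, 0,   # R2
                1, 1, 0, 1, 1, 0, 0, 0, 0,   # R3
                1, 1, 0, 1, 0, 1, 0, 0, 0,   # R4
                1, 1, 0, 1, 0, 1, 0, 0, 1),  # R5
              nrow = 6, byrow = TRUE,
              dimnames = list(c("germline", "R1", "R2", "R3", "R4", "R5"),
                              paste0("mut", 1:9)))

# Matrix method: each row is treated as one taxon.
dat <- as.phyDat(mat, type = "USER", levels = c(0, 1))

set.seed(1)                        # the ratchet is stochastic
tree <- pratchet(dat, trace = 0)   # parsimony ratchet, unrooted tree
tree <- acctran(tree, dat)         # assign branch lengths (ACCTRAN)

plot(tree, type = "unrooted")
```

If your table has regions in columns instead, transpose it with t() before calling as.phyDat(), as noted above.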

Regards,
Klaus

8.2 years ago
poisonAlien ★ 3.2k

There are multiple ways to construct trees based on binary data.

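  1. You can use the neighbor-joining method for tree construction, since you already have a binary matrix. Note that nj() and dist.gene() come from ape, which is attached when phangorn is loaded.

    library(phangorn) #also attaches ape, which provides nj(), dist.gene() and write.tree()
    mat.nj = nj(dist.gene(t(mat))) #neighbour-joining tree; t() assumes mat has samples in columns
    plot(mat.nj, 'cladogram') #plot cladogram
    write.tree(mat.nj, 'mat.newick') #write newick tree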
    

    Alternatively, use the UPGMA method from phangorn (I'm not sure it is appropriate for binary data):

    mat.upgma = upgma(dist.gene(t(mat)))
    
  2. Use Phylip character parsimony (pars), which I think is the most suitable for this kind of data.

    Collapse your matrix into the pars input format. For example:

    4 5 #Four samples, five mutations
    tumor_right
    11110
    tumor_left
    11111
    tumor_up
    11001
    tumor_root
    00000
    

    Then run phylip pars with the outgroup root set to your germline sample (here it's tumor_root, the 4th sample). Setting an outgroup in phangorn is a bit difficult (I'm not sure, though).

  3. As Chris suggested in another answer, you can use more sophisticated methods such as LICHeE, which uses VAF information to cluster variants and construct trees (it also divides trees based on clones).
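For option 2, the pars input can be written from R instead of by hand. The sketch below is an assumption about a convenient way to do it, not part of phangorn or Phylip: `write_pars_input` is a hypothetical helper, and `mat` is a made-up matrix with samples in rows. It relies on the Phylip convention that each name occupies exactly the first 10 characters of its line, so longer names such as tumor_right are truncated.

```r
# Hypothetical matrix (illustration only): rows = samples, columns = mutations.
mat <- matrix(c(1, 1, 1, 1, 0,
                1, 1, 1, 1, 1,
                1, 1, 0, 0, 1,
                0, 0, 0, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("tumor_right", "tumor_left",
                                "tumor_up", "tumor_root"), NULL))

# Hypothetical helper: write a 0/1 matrix in Phylip discrete-character format.
write_pars_input <- function(mat, file) {
  # Truncate names to 10 characters, then pad them to exactly 10.
  nm <- sprintf("%-10s", substr(rownames(mat), 1, 10))
  stopifnot(!anyDuplicated(nm))                  # truncation must keep names unique
  states <- apply(mat, 1, paste, collapse = "")  # one 0/1 string per sample
  lines <- c(paste(nrow(mat), ncol(mat)),        # header: n samples, n characters
             paste0(nm, states))
  writeLines(lines, file)
  invisible(lines)
}

lines <- write_pars_input(mat, tempfile(fileext = ".phy"))
```

The header is derived from the matrix dimensions, so the character count always matches the length of the 0/1 strings.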

Hello Alien! Do you know how to draw a rooted tree with phangorn? That is to say, how can I set the group (0,0,0,0,0) to be the root? Regards, Tang
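One possible approach, offered as a sketch rather than a definitive answer: trees built with ape or phangorn are ordinary "phylo" objects, so ape's root() can re-root them on the all-zero sample. The matrix and sample names below are invented for illustration, and a neighbour-joining tree stands in for whatever tree you have built.

```r
library(ape)

# Hypothetical matrix (illustration only): rows = samples, columns = mutations.
mat <- matrix(c(1, 1, 1, 1, 0,
                1, 1, 1, 1, 1,
                1, 1, 0, 0, 1,
                0, 0, 0, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("tumor_right", "tumor_left",
                                "tumor_up", "tumor_root"), NULL))

tree <- nj(dist.gene(mat))   # unrooted tree; samples are already in rows here

# Re-root on the all-zero (germline-like) sample.
rooted <- root(tree, outgroup = "tumor_root", resolve.root = TRUE)

plot(rooted)
```

With resolve.root = TRUE the root becomes a bifurcating node, so is.rooted(rooted) returns TRUE.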

8.2 years ago

An alternative solution, and a more typical workflow for cancer samples, would be to feed your VAF and clustering information into a package like ClonEvol, which does the phylogenetic inference and produces some nice visualizations. Clustering can be accomplished with a package like SciClone or PyClone.
