Can anyone help me interpret this ClonalFrameML output?
14 months ago

Hi! I have run ClonalFrameML on a tree containing different species of the Mycobacterium genus.

However, the output is not explained, and I don't know whether these results mean I can trust my tree or what I should do next (for example, continue my analysis with the reconstructed tree).

Can anyone help me?

Thanks!

Here is the importation_status.txt file:

Node    Beg End
A.subflavus_prokka.gff  4767    4982
A.subflavus_prokka.gff  6678    6933
A.subflavus_prokka.gff  7374    7716
abscessus_prokka.gff    1   8025
avium_prokka.gff    1   8025
canettii_prokka.gff 1605    1959
chitae_prokka.gff   1   8025
confluentis_prokka.gff  1   8025
fallax_prokka.gff   1   8025
fortuitum_prokka.gff    1   8025
gilvum_prokka.gff   1   7623
kansasii_prokka.gff 1   8025
leprae_prokka.gff   1   8025
rutilum_prokka.gff  1   8025
smegmatis_prokka.gff    6690    6885
smegmatis_prokka.gff    7479    7689
vanbaalenii_prokka.gff  1   8025
NODE_21 1   615
NODE_22 1   8025
NODE_23 1   498
NODE_23 840 2896
NODE_23 3885    4728
NODE_23 5193    6420
NODE_23 7869    8025
NODE_26 1   8025
NODE_27 1   1023
NODE_27 2568    8025
NODE_28 1   8025
NODE_29 1   8025
NODE_30 1   510
NODE_30 1146    4987
NODE_30 5475    8025
NODE_31 1   1152
NODE_31 2154    2931
NODE_31 3495    3842
NODE_31 4377    5023
NODE_31 5472    7488
NODE_32 2084    2640
NODE_32 4171    4185
NODE_32 5464    5510
NODE_32 7848    8025
NODE_33 1914    2021
NODE_33 6066    6556
NODE_33 7896    8025
NODE_34 1   8025
NODE_35 2350    2517
NODE_35 6098    6297
NODE_36 2449    2534
NODE_36 6045    6519
NODE_37 1   564
NODE_37 1395    2928
NODE_37 3378    6831
NODE_37 7551    8025

Recombination ClonalFrameML Phylogenies
Have you looked at the plots you can produce from the output? These will probably help you understand the output.

It's been a while since I used it, but IIRC, that file tells you the intervals of sequence which are likely horizontally acquired.
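If it helps, here is a minimal Python sketch for summarizing those intervals per node. It assumes the file is whitespace-separated with a `Node Beg End` header, as in the paste above, and it treats `Beg` and `End` as 1-based inclusive coordinates, which is my assumption rather than something I have checked against the CFML documentation. The function names are made up for illustration.

```python
import io
from collections import defaultdict


def read_importations(handle):
    """Collect (beg, end) intervals per node from importation_status.txt-style text."""
    intervals = defaultdict(list)
    for line in handle:
        fields = line.split()
        # Skip blank lines and the "Node Beg End" header.
        if len(fields) != 3 or fields[0] == "Node":
            continue
        node, beg, end = fields[0], int(fields[1]), int(fields[2])
        intervals[node].append((beg, end))
    return dict(intervals)


def imported_length(spans):
    """Total interval length, assuming inclusive and non-overlapping coordinates."""
    return sum(end - beg + 1 for beg, end in spans)


# A few rows copied from the file in the question.
sample = """Node    Beg End
A.subflavus_prokka.gff  4767    4982
A.subflavus_prokka.gff  6678    6933
A.subflavus_prokka.gff  7374    7716
abscessus_prokka.gff    1   8025
NODE_21 1   615
"""

intervals = read_importations(io.StringIO(sample))
totals = {node: imported_length(spans) for node, spans in intervals.items()}

for node, total in totals.items():
    print(f"{node}\t{len(intervals[node])} interval(s)\t{total} sites")
```

To run it on the real file, replace `io.StringIO(sample)` with `open("importation_status.txt")`, adjusting the path to your output prefix.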

Yes, this is the plot it produces. As far as I know, the blue horizontal lines are recombination events identified by ClonalFrameML. The short vertical lines are substitutions: non-homoplasic ones in white, homoplasic ones in yellow to red (the redder, the more homoplasies at that site).

But does that mean that some species in my tree are inferred to be recombinant along their whole length? Do you know if I can trust my tree? How would you present this in a paper?

If you look, more or less all of the full blue lines correspond to some deep inferred node (i.e. I think this is essentially trying to do some ancestor reconstruction). I probably wouldn't pay a great deal of attention to this, and maybe focus on the clearer signal; e.g. in the top right you have a region between about 155000 and 180000 which looks to have quite a clear recombination signal.
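One rough way to separate the full-length lines from the localized ones is to compute how much of the alignment each node's intervals cover. The sketch below is only illustrative: the alignment length of 8025 is taken from the largest `End` value in the posted file (an assumption), the 0.9 cutoff is arbitrary, and `split_by_coverage` is a hypothetical helper, not part of CFML. It also assumes intervals for a node do not overlap.

```python
# Largest End value in the posted file; assumed to be the alignment length.
ALIGNMENT_LENGTH = 8025

# A few (node, beg, end) rows copied from the file in the question.
rows = [
    ("abscessus_prokka.gff", 1, 8025),
    ("canettii_prokka.gff", 1605, 1959),
    ("smegmatis_prokka.gff", 6690, 6885),
    ("smegmatis_prokka.gff", 7479, 7689),
    ("NODE_22", 1, 8025),
    ("NODE_35", 2350, 2517),
]


def split_by_coverage(rows, length, threshold=0.9):
    """Split nodes into near-complete and localized coverage groups.

    The threshold is an arbitrary cutoff chosen for illustration.
    Coordinates are treated as inclusive.
    """
    covered = {}
    for node, beg, end in rows:
        covered[node] = covered.get(node, 0) + (end - beg + 1)
    whole = sorted(n for n, c in covered.items() if c / length >= threshold)
    local = sorted(n for n, c in covered.items() if c / length < threshold)
    return whole, local


whole, local = split_by_coverage(rows, ALIGNMENT_LENGTH)
print("near-complete coverage:", whole)
print("localized intervals:", local)
```

Nodes in the second group are the ones where a specific region, rather than the whole alignment, is flagged, so they are probably the easier ones to inspect and report.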

This isn't really my area of major expertise, but I would perhaps look at some other papers that used CFML and look at what they've done and been able to infer.

Hi there. How did you use the importation_status.txt output with the tree? Is there a command or R script for this?

There is an R script included with CFML that creates the plots above from that file.