Dear all, I have multiple sequence alignments for parts of a virus genome. How can I end up with a phylogenetic tree (circular tree) showing bootstrap values and p-distances on it? Also, how can I plot the p-distance distribution? Many thanks in advance.
There is a lot of information on the World Wide Web about building phylogenetic trees, but it is scattered across many sources, meaning that you'll have to piece it all together to get what you want.
This first tutorial may help you to get from your FASTA sequences to a clustering object / tree structure: Using R to Caculate Genetic Distance and Generate Phylogenetic tree. It utilises the ape package in R. NB - the typo in the title is not my own.
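As a rough sketch of that first step (not taken from the tutorial itself), something like the following should get you from an alignment to p-distances and a neighbour-joining tree with ape. The five sequences and their names are made up for illustration; with real data you would read your own alignment with read.dna() instead.

```r
library(ape)

# Toy alignment (hypothetical data). With a real file, something like
# read.dna("my_alignment.fasta", format = "fasta") would replace this.
seqs <- c(
  virusA = "acgtacgtacgtacgt",
  virusB = "acgtacgtacgaacgt",
  virusC = "acgtaggtacgaacgt",
  virusD = "ttgtaggtacgaaccc",
  virusE = "ttgtaggtccgaaccc"
)
aln <- as.DNAbin(do.call(rbind, strsplit(seqs, "")))

# model = "raw" gives the proportion of differing sites, i.e. the p-distance
d <- dist.dna(aln, model = "raw")

# Neighbour-joining tree built from the p-distance matrix
tree <- nj(d)

print(round(as.matrix(d), 3))
```

Neighbour joining is just one choice here; any method that accepts a distance matrix could be swapped in.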
Building circular dendrograms / phylogenetic trees
If you want more information on building circular dendrograms, take a look at this Biostars thread, to which I have contributed: A: how to draw circular dendrogram with distance information. There are also many pages on the WWW that show how to build these in various ways.
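If you stay within ape, one simple way to get a circular layout is the "fan" type of its plot method. This is only a minimal sketch on made-up sequences (the names and data are hypothetical), and the linked thread covers other approaches. The edge labels are the branch lengths of the tree, which are in p-distance units because the tree was built from a p-distance matrix.

```r
library(ape)

# Toy alignment (hypothetical data)
seqs <- c(
  virusA = "acgtacgtacgtacgt",
  virusB = "acgtacgtacgaacgt",
  virusC = "acgtaggtacgaacgt",
  virusD = "ttgtaggtacgaaccc",
  virusE = "ttgtaggtccgaaccc"
)
aln <- as.DNAbin(do.call(rbind, strsplit(seqs, "")))
tree <- nj(dist.dna(aln, model = "raw"))

# Circular ("fan") layout
plot(tree, type = "fan", cex = 0.9)

# Scale bar in the units of the branch lengths (p-distance here)
add.scale.bar()

# Write each branch length next to its edge
edgelabels(round(tree$edge.length, 3), frame = "none", cex = 0.7)
```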
Bootstrap the tree structure
If you want to bootstrap the tree structure (dendrogram) and derive P values for the branching, then use the pvclust package in R. Again, here's a Biostars thread to which I have contributed: A: how to make bootstrapped tree in PVCLUST package with SNP genotyping data?
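pvclust works on a numeric data matrix, so for an aligned set of DNA sequences it may be easier to stay in ape. The sketch below swaps in ape's boot.phylo(), which is a different technique from pvclust: it resamples alignment columns, rebuilds the tree each time, and counts how often each clade of the original tree reappears. The sequences are made up, and B = 100 is an arbitrary choice for illustration.

```r
library(ape)

# Toy alignment (hypothetical data)
seqs <- c(
  virusA = "acgtacgtacgtacgt",
  virusB = "acgtacgtacgaacgt",
  virusC = "acgtaggtacgaacgt",
  virusD = "ttgtaggtacgaaccc",
  virusE = "ttgtaggtccgaaccc"
)
aln <- as.DNAbin(do.call(rbind, strsplit(seqs, "")))

# Same tree-building recipe for the original and for every bootstrap replicate
build_tree <- function(x) nj(dist.dna(x, model = "raw"))
tree <- build_tree(aln)

set.seed(1)  # bootstrap resampling is random; fix the seed for repeatability
boot <- boot.phylo(tree, aln, FUN = build_tree, B = 100, quiet = TRUE)

# One support count per internal node (out of B replicates)
plot(tree, type = "fan", cex = 0.9)
nodelabels(boot, frame = "none", cex = 0.8)
```

With B = 100 the counts can be read directly as percentages; for a real analysis you would normally use more replicates.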
P value distribution
If you want to plot the P value distributions, take a look here: DESeq2 unequal sample sizes
Finally, in order to piece all of this together in a single figure, use the mfrow parameter of par() with base plot(), or the cowplot or gridExtra packages for grid-based graphics such as ggplot2.
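For the base graphics route, a minimal sketch with mfrow could look like the following. It puts a circular tree next to a histogram of the pairwise p-distances, which is the distribution the original question asked about. The sequences are made up, and the panel titles and axis labels are my own choices.

```r
library(ape)

# Toy alignment (hypothetical data)
seqs <- c(
  virusA = "acgtacgtacgtacgt",
  virusB = "acgtacgtacgaacgt",
  virusC = "acgtaggtacgaacgt",
  virusD = "ttgtaggtacgaaccc",
  virusE = "ttgtaggtccgaaccc"
)
aln <- as.DNAbin(do.call(rbind, strsplit(seqs, "")))
d <- dist.dna(aln, model = "raw")   # pairwise p-distances
tree <- nj(d)

# One row, two panels; keep the old settings so they can be restored
op <- par(mfrow = c(1, 2))

plot(tree, type = "fan", cex = 0.9, main = "NJ tree (p-distance)")

# as.vector() drops the dist structure and leaves one value per sequence pair
pdist <- as.vector(d)
hist(pdist, xlab = "p-distance", main = "Pairwise p-distance distribution")

par(op)
```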
Hopefully this helps you somewhat.