I have 50-60 bacterial strains with an average genome length of 1.5 Mb. I want to construct a genome tree by doing multiple genome alignment. Please provide me with guidelines/tools.
I get that there are reasons for whole-genome alignments, but making genome trees does not strike me as one of them. Unless all your strains are very closely related to each other and you are trying to tease out subtle differences, I doubt you will gain much useful information from this exercise.
Instead, there is a tool called ezTree that will look at multiple genomes, predict their ORFs, find single-copy marker genes they all share, and build a tree from their concatenated alignments. The last step can certainly be done better - I do it manually outside of this pipeline - but I think this general approach is more likely to give you useful information.
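To make the concatenation step concrete, here is a minimal sketch of joining per-gene alignments into one supermatrix. This is not ezTree's code; the function name `concatenate_alignments` and the dict-based input format are assumptions made for illustration, and the gap padding for a genome missing from a gene is also my assumption (with truly shared single-copy markers it should never trigger).

```python
def concatenate_alignments(gene_alignments, genomes):
    """Join per-gene alignments into one concatenated alignment per genome.

    gene_alignments: list of dicts mapping genome name -> aligned sequence.
                     All sequences within one dict must have the same length.
    genomes:         list of genome names to include, in output order.
    (Hypothetical helper for illustration, not part of any pipeline.)
    """
    parts = {g: [] for g in genomes}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one gene alignment must have equal length")
        width = lengths.pop()
        for g in genomes:
            # Assumption: a genome missing from this gene is padded with gaps.
            parts[g].append(aln.get(g, "-" * width))
    return {g: "".join(chunks) for g, chunks in parts.items()}


# Toy example with two marker genes and two genomes.
gene1 = {"strainA": "AC-G", "strainB": "ACTG"}
gene2 = {"strainA": "TT"}
supermatrix = concatenate_alignments([gene1, gene2], ["strainA", "strainB"])
```

The resulting dict can be written out as FASTA and handed to whatever tree builder you prefer.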
K-mer signature-based methods include sourmash.
It's functionally impossible to multiply-align whole genomes.
You need to explore using mash distances, or create clustered, concatenated orthologue alignments.
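For a feel of what a mash-style distance is, here is a rough sketch that turns the Jaccard index of two k-mer sets into a distance with the Mash formula D = -(1/k) ln(2j / (1 + j)). It uses exact k-mer sets on toy strings; the real tool estimates the Jaccard index from MinHash sketches and uses canonical k-mers, which this deliberately skips. The function names are made up for illustration.

```python
import math


def kmers(seq, k):
    """Return the set of all k-mers in a sequence (no canonicalisation)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard index of two k-mer sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j, k):
    """Mash distance from a Jaccard index j and k-mer size k."""
    if j <= 0:
        return 1.0  # no shared k-mers: maximum distance
    if j >= 1:
        return 0.0  # identical k-mer sets
    return min(1.0, -math.log(2 * j / (1 + j)) / k)


# Toy example; real genomes would use something like k=21.
k = 3
j = jaccard(kmers("ACGTACGTTG", k), kmers("ACGTACGATG", k))
d = mash_distance(j, k)
```

Computing this for every pair of genomes gives a distance matrix that can go into a neighbour-joining tree.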