Hi All,
I am trying to build a core-genome phylogeny of 8 different bacterial species. I ran blastclust to cluster homologous proteins. One of the clusters contains two copies of the same protein from one species. Should I keep one copy in the cluster and build the alignment, or should I discard that cluster from the alignment completely?
One more question: is there a minimum length threshold for protein sequences going into an alignment, or can shorter sequences, for example around 50 residues, also be used?
Please help!!
Thanks 5heikki, both copies are exactly identical, so I think I can use either one of them.
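
In case it is useful to anyone with the same problem, here is a rough sketch of how the filtering could be scripted once the blastclust output has been parsed. It is only an illustration under some assumptions: the function name `core_single_copy_clusters`, the `clusters` dictionary layout, the toy species names and the toy sequences are all made up for the example, and `min_len=50` is just the figure from my question, not a recommended cutoff. Identical copies from the same species are collapsed into one, and a cluster is dropped if any species is missing, still has more than one distinct copy, or has a sequence shorter than `min_len`.

```python
from collections import defaultdict


def core_single_copy_clusters(clusters, species, min_len=50):
    """Keep clusters with exactly one distinct protein per species.

    clusters: dict of cluster_id -> list of (species_name, protein_sequence)
    species:  iterable of all species expected in a core cluster
    min_len:  hypothetical minimum sequence length (an assumption, tune it)
    """
    expected = set(species)
    kept = {}
    for cluster_id, members in clusters.items():
        by_species = defaultdict(set)
        for sp, seq in members:
            # A set collapses exactly identical copies from the same species.
            by_species[sp].add(seq)
        # Core cluster: every expected species present, and nothing else.
        if set(by_species) != expected:
            continue
        # Drop clusters where a species still has divergent copies (paralogs).
        if any(len(seqs) != 1 for seqs in by_species.values()):
            continue
        one_per_species = {sp: next(iter(seqs)) for sp, seqs in by_species.items()}
        # Drop clusters containing a sequence below the length cutoff.
        if any(len(seq) < min_len for seq in one_per_species.values()):
            continue
        kept[cluster_id] = one_per_species
    return kept


# Toy data: three made-up species and four made-up clusters.
species = ["sp_A", "sp_B", "sp_C"]
clusters = {
    # sp_A has two identical copies: collapsed, cluster kept
    "c1": [("sp_A", "MKT" * 20), ("sp_A", "MKT" * 20),
           ("sp_B", "MKT" * 20), ("sp_C", "MKS" * 20)],
    # sp_A has two different copies: cluster dropped
    "c2": [("sp_A", "MKT" * 20), ("sp_A", "MRT" * 20),
           ("sp_B", "MKT" * 20), ("sp_C", "MKT" * 20)],
    # sp_C is missing: not a core cluster, dropped
    "c3": [("sp_A", "MKT" * 20), ("sp_B", "MKT" * 20)],
    # all sequences shorter than min_len: dropped
    "c4": [("sp_A", "MKTAYIAKQR"), ("sp_B", "MKTAYIAKQR"), ("sp_C", "MKTAYIAKQR")],
}

result = core_single_copy_clusters(clusters, species, min_len=50)
print(sorted(result))
```

Each surviving cluster then holds one sequence per species and can be written out as a FASTA file for alignment.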