I am trying to build a core-genome phylogeny of 8 different bacterial species. I ran blastclust to cluster homologous proteins. One of the clusters contains two copies of the same protein from one species. Should I keep one copy in that cluster and align it, or should I discard the cluster from the alignment completely?
One more question: is there a minimum length threshold for protein sequences that go into an alignment, or can shorter sequences, for example 50 residues, be used as well?
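To make the two questions concrete, here is a rough Python sketch of the stricter option: keep only clusters with exactly one protein from every species, and optionally drop clusters containing a sequence below a length cutoff. It assumes the usual blastclust output of one cluster per line with whitespace-separated sequence IDs. The ID layout `species|protein`, the helper names `species_of` and `single_copy_core_clusters`, and the `lengths` dictionary are my own assumptions for illustration, not anything blastclust provides.

```python
from collections import Counter


def species_of(seq_id):
    # Assumption: IDs look like "speciesA|protein123"; adapt to your own naming.
    return seq_id.split("|", 1)[0]


def single_copy_core_clusters(cluster_lines, all_species, lengths=None, min_len=0):
    """Return clusters with exactly one sequence from every species.

    cluster_lines: iterable of strings, one cluster per line (IDs separated by whitespace).
    all_species:   the species that must all be represented.
    lengths:       optional dict mapping sequence ID to length in residues.
    min_len:       clusters with any sequence shorter than this are dropped
                   (only applied when lengths is given).
    """
    required = set(all_species)
    kept = []
    for line in cluster_lines:
        ids = line.split()
        if not ids:
            continue  # skip blank lines
        counts = Counter(species_of(seq_id) for seq_id in ids)
        if set(counts) != required:
            continue  # not a core cluster: a species is missing or unexpected
        if any(n > 1 for n in counts.values()):
            continue  # a species has more than one copy: discard the cluster
        if lengths is not None and any(lengths[seq_id] < min_len for seq_id in ids):
            continue  # at least one sequence is below the length cutoff
        kept.append(ids)
    return kept


# Toy example with 3 species instead of 8
clusters = [
    "sp1|a sp2|a sp3|a",        # single copy in every species
    "sp1|b sp1|c sp2|b sp3|b",  # two copies from sp1
    "sp1|d sp2|d",              # sp3 missing
]
core = single_copy_core_clusters(clusters, ["sp1", "sp2", "sp3"])
```

In this toy input only the first cluster is kept. The other option, keeping one of the two copies, would need an extra rule for choosing which copy to retain, which is exactly the part I am unsure about.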