Question: Phylogenetic Analysis of gene sequences with Non-synonymous mutations
I have a large number of complete genomes (downloaded from NCBI) from the same bacterial species. As recommended in the MUSCLE guidelines, I have already clustered the sequences with USEARCH (UCLUST) and divided my data into gene clusters. I am using the MEGA software on Windows. However, there are two problems in the gene cluster I am interested in:

  1. Gene sequences in the same cluster are not of equal length.
  2. A few gene sequences in that cluster have a stop codon at the start or in the middle of the sequence (due to non-synonymous mutations).
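To see exactly which sequences are affected before aligning, both problems can be checked with a few lines of Python. This is only a rough sketch: it assumes the sequences are nucleotide coding sequences read in frame 0, uses the three standard stop codons (TAA, TAG, TGA), and the genome names and sequences in `cluster` are made-up placeholders, not real data.

```python
# Stop codons of the standard genetic code (assumption: the genes use it).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_internal_stops(seq, frame=0):
    """Return 0-based nucleotide positions of stop codons found before the final codon."""
    seq = seq.upper().replace("-", "")  # ignore alignment gaps, if any
    codon_starts = range(frame, len(seq) - 2, 3)
    if not codon_starts:
        return []
    last_start = codon_starts[-1]  # a stop in the final codon is the normal terminator
    return [
        i for i in codon_starts
        if seq[i:i + 3] in STOP_CODONS and i != last_start
    ]


def summarize_cluster(cluster):
    """Map each sequence name to (length, positions of premature stop codons)."""
    return {name: (len(seq), find_internal_stops(seq)) for name, seq in cluster.items()}


# Hypothetical toy cluster, for illustration only.
cluster = {
    "genomeA": "ATGAAATAA",
    "genomeB": "ATGTAAAAATAA",   # premature stop at position 3
    "genomeC": "ATGAAACCCTAA",
}

for name, (length, stops) in summarize_cluster(cluster).items():
    print(name, length, stops)
```

Sequences flagged this way do not have to be dropped; the report just shows which genomes carry a premature stop and how much the lengths differ.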


  1. If I remove the sequences with stop codons at the start or in the middle, I would effectively be removing those genomes from my study, which I don't want. Is it possible to solve this issue without excluding those genomes?

  2. Can MUSCLE/USEARCH replace the gaps or stop codons with other characters so that the aligned sequences are of equal length?

What I need:

  1. MUSCLE should finish the alignment, and the results for that gene cluster should not be affected by the problems mentioned above.
  2. The resulting aligned sequences should be of equal length.
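On the second point, a multiple sequence alignment pads shorter sequences with gap characters (`-`), so every row of the output should have the same length even when the input sequences do not. A minimal sketch for verifying this on an aligned FASTA file is shown here; the helper names and the toy records are illustrative assumptions, not part of MUSCLE or MEGA.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_lengths_equal(records):
    """True if every aligned sequence (gaps included) has the same length."""
    return len({len(seq) for seq in records.values()}) <= 1


# Hypothetical aligned output: the shorter gene is padded with gaps.
aligned_text = """>genomeA
ATG---AAATAA
>genomeC
ATGCCCAAATAA
"""

records = parse_fasta(aligned_text)
print(alignment_lengths_equal(records))
```

To run it on a real file, read the alignment with `open(path).read()` and pass the text to `parse_fasta`.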

PS: I have tried MUSCLE only within MEGA 7, not the command-line version. If a solution exists in the command-line version, I can try that as well.


MUSCLE works fine when run on its own rather than inside MEGA. Thanks anyway.
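For anyone who wants to script the standalone run over many gene clusters, a small wrapper along these lines may help. It is a sketch under assumptions: the executable is assumed to be called `muscle` and to be on the PATH, the file names are placeholders, and the flags are `-in`/`-out` as used by MUSCLE 3.x and `-align`/`-output` as used by MUSCLE 5, so check them against the version actually installed.

```python
import subprocess


def build_muscle_command(input_fasta, output_fasta, major_version=5, executable="muscle"):
    """Build the argument list for a basic MUSCLE alignment run.

    Assumption: MUSCLE 5 takes -align/-output, MUSCLE 3.x takes -in/-out.
    """
    if major_version >= 5:
        return [executable, "-align", input_fasta, "-output", output_fasta]
    return [executable, "-in", input_fasta, "-out", output_fasta]


def run_muscle(input_fasta, output_fasta, major_version=5):
    """Run MUSCLE and raise CalledProcessError if it exits with an error."""
    cmd = build_muscle_command(input_fasta, output_fasta, major_version)
    subprocess.run(cmd, check=True)


# Placeholder file names; this only prints the command, it does not run MUSCLE.
print(build_muscle_command("cluster_42.fasta", "cluster_42.afa", major_version=3))
```

Calling `run_muscle("cluster_42.fasta", "cluster_42.afa")` would then perform the alignment, provided MUSCLE is installed.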

