Solved: I stopped using clustalo and went back to mafft (what I originally wanted to use), which at first could not read my alignments. The cause was non-standard characters that had been inserted in my exon alignments ("!" and "?"). After I converted those to dashes, mafft read the alignment properly and I was able to append my new species while maintaining the reading frame of my original MSA. For those who may need to do the same, I used (Windows, Cygwin):
$ mafft --addfull outgroup_species.fasta --keeplength prealigned_msa.fasta > combined_msa.fasta
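The character conversion mentioned above could be done in the same Cygwin shell with something like the following. This is only a minimal sketch, assuming sed is available; the function name clean_fasta and the input file name are placeholders I made up, not part of mafft. It replaces "!" and "?" with "-" on sequence lines only and leaves header lines (starting with ">") untouched.

```shell
# Hypothetical helper: replace "!" and "?" with "-" on sequence lines only.
# Lines starting with ">" (FASTA headers) are passed through unchanged.
clean_fasta() {
    sed '/^>/!s/[!?]/-/g' "$1"
}

# Example usage (file names are placeholders):
# clean_fasta exon_alignment.fasta > prealigned_msa.fasta
```

Writing to a new file rather than editing in place keeps the original alignment around in case the "!" and "?" positions are needed later.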
I need to align a new sequence to a pre-existing multiple sequence alignment. I know how to run a clustalo profile-profile alignment in which I treat my one new sequence as a separate alignment. But every time I run this process, gaps get added between columns of the pre-existing MSA. I need to avoid this because it ruins my reading frame.
Is there an option to simply not alter the first profile alignment at all?
Sample of my pre-aligned MSA (if I were looking at the first 4 exons):
>sp1
----ATGCTC---ATAT
>sp2
----ATGGTC---ATAT
>sp3
CCAT---------ATAT # These gaps are inserted to represent a missing exon
>sp4
CCATATGGTCCCC---- # Gaps needed to maintain the reading frame per exon
The sequence I want to add to the pre-aligned MSA (it has some extra bases, which I show with (), that need to be trimmed after alignment; all exons are included because this is a reference sequence):
>sp1
----ATGCTC---ATAT
>sp2
----ATGGTC---ATAT
>sp3
CCAT---------ATAT
>sp4
CCATATGGTCCCC----
>outgroup
CCATATGGTCCCCATAT
I am not sure how to align the new sequence while maintaining the length of the MSA, because any added columns will break the reading frame. I cannot convert the bases to amino acids either, because I need to stay in nucleotides to compute dN/dS ratios later.
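Whichever aligner is used, one way to confirm that no columns were added is to compare the alignment width before and after. The following is a rough sketch, assuming awk is available in the shell; msa_width is a hypothetical helper name, and the file names in the usage comments are placeholders. It prints the width if every record has the same length and exits non-zero otherwise.

```shell
# Hypothetical helper: print the alignment width (number of columns) if all
# records in a FASTA file have the same length; otherwise exit non-zero.
msa_width() {
    awk '
        # Compare the finished record against the width of the first record.
        function check() {
            if (width == "") width = len
            else if (len != width && !bad) { bad = 1; badname = name }
        }
        # Header line: close the previous record and start a new one.
        /^>/ { if (name != "") check(); name = $0; len = 0; next }
        # Sequence line: strip whitespace and carriage returns, then count.
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END {
            if (name != "") check()
            if (bad) { print "length mismatch at " badname > "/dev/stderr"; exit 1 }
            print width
        }
    ' "$1"
}

# Example usage (file names are placeholders):
# before="$(msa_width prealigned_msa.fasta)"
# after="$(msa_width combined_msa.fasta)"
# [ "$before" = "$after" ] && echo "width preserved: $after columns"
```

This only checks the column count; it does not verify that the existing rows are byte-for-byte unchanged, which would need a separate comparison of the original records.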