Multiple sequence alignment of sequences of different lengths
7.9 years ago
rosies • 0

Hi there,

I have the consensus sequences for 48 strains that I have mapped and aligned to a reference using CLC. Each is approximately 3 Mbp long (the lengths differ by about 100 bp between sequences). I am trying to perform a time divergence analysis, but before that I need to format my sequences so that recombination has been considered and the alignments are all of the same length.

I would like to know if there is any software that would perform a multiple sequence alignment across the 48 strains, and remove positions where there is little or no coverage in at least one of the 48 strains, and that handles indels.

I would like this software to produce 48 sequences of equal length so that they can be fed into other software such as Gubbins (to detect recombination) and then BEAST (for time divergence). I have tried to use Gblocks, but this software requires sequences to be of the same length.

I look forward to hearing your response and ideas.

Thank you and kind regards,

Rosie

7.9 years ago
arnstrm ★ 1.8k

You can use any multiple sequence alignment program to align all the sequences. I recommend MUSCLE, simply because I know most of its options; there is nothing otherwise special about this aligner. It is a command-line program and has to be run in a terminal (Linux/Mac) or the command prompt (Windows). You can also use the web server version (there might be a limit on how much you can align). Basically,

muscle -in input.fasta -out output.aln
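
If you want to confirm that the aligned sequences really are all the same length before moving on, something like the following minimal Python sketch could help. It assumes the MUSCLE output is in FASTA format; `read_fasta` and `alignment_length` are hypothetical helper names written for this example, not part of MUSCLE or any library, and the two short sequences are made-up test data.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA records from a text handle into a {name: sequence} dict."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the full header text (without '>') as the record name.
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_length(records):
    """Return the shared length of all sequences, or raise if they differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences are not of equal length: {sorted(lengths)}")
    return lengths.pop()


# Made-up two-sequence alignment standing in for the real aligned file.
example = StringIO(">strainA\nACGT-A\nCG\n>strainB\nAC-TTACG\n")
aligned = read_fasta(example)
print(alignment_length(aligned))
```

For a real run you would replace the `StringIO` example with `open(...)` on your own aligned file.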


After this, you can use trimAl to trim the alignment (columns are removed from all sequences at once, so the trimmed sequences all have the same length). Again in the terminal:

trimal -in alignment_file -out output_file -phylip -automated1


Here, -phylip specifies the output format (useful if you plan on using PHYLIP); you can choose other output formats as well. The -automated1 option decides the optimum trimming parameters for you, so that it doesn't remove useful information from the alignment.
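
If you would rather apply the strict rule from the question (drop every position that is missing in at least one strain), here is a rough Python sketch of that idea. This is not what trimAl's -automated1 does; it is a much blunter filter. It assumes gaps are written as `-` and uncovered bases as `N` or `?`, and the function name `drop_incomplete_columns` and the three toy sequences are made up for illustration.

```python
def drop_incomplete_columns(records, missing="-Nn?"):
    """Keep only alignment columns where no sequence has a missing character.

    records: {name: aligned sequence}, all sequences of equal length.
    missing: characters treated as "no data" (an assumption; adjust as needed).
    """
    names = list(records)
    if len({len(records[n]) for n in names}) > 1:
        raise ValueError("input sequences must already be aligned to equal length")

    # Walk the alignment column by column and keep the complete columns.
    columns = zip(*(records[n] for n in names))
    kept = [col for col in columns if not any(c in missing for c in col)]
    if not kept:
        return {n: "" for n in names}

    # Transpose the kept columns back into one row per sequence.
    rows = zip(*kept)
    return {n: "".join(row) for n, row in zip(names, rows)}


# Toy alignment: columns 3, 5 and 6 (1-based) each contain a gap or an N.
toy = {
    "strainA": "ACGT-ACG",
    "strainB": "AC-ATACG",
    "strainC": "ACGTTNCG",
}
trimmed = drop_incomplete_columns(toy)
for name, seq in trimmed.items():
    print(name, seq)
```

Every sequence loses the same columns, so the output stays aligned and of equal length. Be aware that with 48 strains a rule this strict can throw away a lot of sites.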

I hope this helps!

Thank you arnstrm, this worked a treat!

