Question

Two questions about the MAFFT alignment tool.

4.0 years ago

josh.singer • 0

Two questions about MAFFT.

(a) Suppose I have an unaligned nucleotide multi-FASTA file A. It contains no gap ('-') characters, but it may contain any of the other IUPAC nucleotide characters. I align it using MAFFT with default options, which gives multi-FASTA file B; this obviously may contain gaps. I then strip out all the gap characters to give file C. Is it guaranteed that A and C will be the same, allowing for trivial differences in sequence order and upper/lower case?

(b) Suppose I have two unaligned nucleotide files, A1 and A2. I align A1 using MAFFT with default options to give B1. I then align B1 and A2 using MAFFT --add, with A2 passed in as the "new sequences", which gives B2. Separately, I concatenate A1 and A2 to give A3, then align A3 using MAFFT with default options to give B3. Are B2 and B3 "algorithmically" equivalent, i.e. would the only differences be down to things like arbitrary stochastic choices?

josh

mafft alignment • 976 views

score 1 · Answer 1 · 2020-04-11

4.0 years ago

Mensur Dlakic ★ 27k

a) Yes

b) Not sure, but probably not. In your first workflow, the existing alignment B1 is used as a guide when the A2 sequences are added. In your second workflow, all sequences are aligned together from scratch. As long as A1 and A2 are comparable, I would not expect B2 and B3 to be wildly different, but I would not expect them to be identical either.
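
If you would rather check both points on your own data than rely on a guarantee, something along these lines should work. It is only a rough sketch: the function names (`read_fasta`, `degapped_equal`, `alignments_equal`) are made up for illustration and are not part of MAFFT, it assumes the FASTA headers come through the alignment unchanged, and it treats only '-' as a gap character.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def normalise(records, strip_gaps):
    """Multiset of (header, uppercased sequence), so order and case are ignored."""
    out = Counter()
    for header, seq in records:
        if strip_gaps:
            seq = seq.replace("-", "")  # assumption: '-' is the only gap symbol
        out[(header, seq.upper())] += 1
    return out


def degapped_equal(unaligned, aligned):
    """Question (a): does removing gaps from B give back A?"""
    return normalise(unaligned, False) == normalise(aligned, True)


def alignments_equal(first, second):
    """Question (b): are B2 and B3 row-for-row identical?"""
    return normalise(first, False) == normalise(second, False)


# Toy data standing in for files A and B.
a = read_fasta(StringIO(">s1\nACGTN\n>s2\nACRT\n"))
b = read_fasta(StringIO(">s2\nac-rt\n>s1\nacgtn\n"))
print(degapped_equal(a, b))  # True
```

With real files you would pass `open("A.fasta")` and so on instead of the `StringIO` objects (the file name is a placeholder). Note that `alignments_equal` is a strict test: it reports a difference even if B2 and B3 differ in a single column.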
