Two questions about the MAFFT alignment tool.
4.0 years ago

Two questions about MAFFT.

(a) Suppose I have an unaligned nucleotide multi-FASTA file A that contains no gap ('-') characters but may contain any of the other IUPAC nucleotide characters. I align it using MAFFT with default options, which gives multi-FASTA file B, which may of course contain gaps. Suppose I then strip all the gap characters from B to give file C. Is it guaranteed that A and C will be the same (allowing for trivial differences in sequence order and upper/lower case)?

(b) Suppose I have two unaligned nucleotide files A1 and A2. I align A1 with MAFFT, default options, to give B1. I then align B1 and A2 using MAFFT --add, with A2 passed in as the "new sequences". This gives B2. Separately, I concatenate A1 and A2 to give A3, then align A3 using MAFFT with default options to give B3. Are B2 and B3 "algorithmically" equivalent? That is, would the only differences be down to things like arbitrary stochastic choices?

josh

mafft alignment • 976 views
4.0 years ago
Mensur Dlakic ★ 27k

a) Yes
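
If you would rather verify this on your own data than take it on trust, here is a minimal sketch of the check described in (a): strip the gaps from the aligned file and compare it with the original, ignoring sequence order and case. The function names (`read_fasta`, `ungapped`, `same_sequences`) are made up for illustration, and the sketch assumes headers are unique and unchanged between the two files.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict.

    Assumes headers are unique; a repeated header would overwrite
    the earlier record.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def ungapped(records):
    """Remove '-' characters and upper-case every sequence."""
    return {name: seq.replace("-", "").upper() for name, seq in records.items()}


def same_sequences(unaligned_text, aligned_text):
    """True if both inputs hold the same sequences once gaps are stripped.

    Dict comparison ignores record order; upper-casing ignores case.
    """
    return ungapped(read_fasta(unaligned_text)) == ungapped(read_fasta(aligned_text))
```

To use it, read file A and file B into strings and pass them to `same_sequences`; a `False` result points to a sequence or header that changed somewhere along the way.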

b) Not sure, but probably not. In your first workflow, B1 is used as a guide to align A2. In your second workflow, all sequences would be aligned together. As long as A1 and A2 are comparable I would not expect B2 and B3 to be wildly different, but I would not expect them to be identical either.
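
For reference, the two workflows from (b) can be written out roughly as below. This is only a sketch: it assumes `mafft` is on your PATH, the file names are placeholders, and `build_commands` and `run_to_file` are hypothetical helpers, not part of MAFFT. MAFFT writes the alignment to standard output, so each command is redirected to a file.

```python
import subprocess


def build_commands(a1, a2, a3, b1):
    """Return the MAFFT command for each output in question (b).

    a1, a2: unaligned inputs; a3: concatenation of a1 and a2;
    b1: path where the alignment of a1 is written (it must exist
    before the B2 command is run).
    """
    return {
        "B1": ["mafft", a1],               # default alignment of A1
        "B2": ["mafft", "--add", a2, b1],  # add A2 to the existing alignment B1
        "B3": ["mafft", a3],               # default alignment of A1 + A2 together
    }


def run_to_file(cmd, out_path):
    """Run one command and capture its standard output in out_path."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)
```

Run the B1 command first, then B2 (which reads B1), then B3, and compare B2 with B3 to see how far they differ on your sequences.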
