Differently sized sequences in a Clustal Omega alignment
2.6 years ago

Hello, I used Clustal Omega to align 401 mitochondrial DNA sequences. I had difficulty understanding the manual and the input options, but I ran the following command:

clustalo -i teste_rn -o teste --threads=16 --outfmt=clustal -v --force


 FORCED DEBUG: Potential Problem: sequences (N = 401) don't have same lengths but contain gaps, consider using --dealign


I ran the command with my file as the input (-i). I tried to understand the recommended option (--dealign), but I could not tell how much it would affect my alignment. As I understand it, it removes the gaps and standardizes all sequences to the same length. Is that right? Wouldn't that hurt my result? The sequences can differ in length because each patient has different indels, so the length differences are expected. My concern is that with --dealign these differences would be ignored and my result would be masked. Can someone help me? What is the best option for me?

alignment clustalo • 1.3k views

Wild shot: did you check whether one of your input sequences contains a gap (or some weird character)?
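One way to run that check is sketched below in Python. This is only an illustration, not part of Clustal Omega: the function name `find_unexpected`, the sample records, and the choice of the IUPAC nucleotide codes as the allowed alphabet are all assumptions made for the example.

```python
# Allowed characters: IUPAC nucleotide codes (an assumption for this sketch).
IUPAC_DNA = set("ACGTURYSWKMBDHVN")


def find_unexpected(text):
    """Map each FASTA header to the set of non-IUPAC characters in its sequence."""
    report = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:]          # remember which record we are in
        elif header is not None and line:
            bad = set(line.upper()) - IUPAC_DNA
            if bad:                    # gaps, '*', digits, etc. end up here
                report.setdefault(header, set()).update(bad)
    return report


# Hypothetical input: the second record contains a gap and a stray '*'.
sample = ">patient_1\nACGTNACGT\n>patient_2\nACG-T*AC\n"
print(find_unexpected(sample))
```

Records that do not appear in the report contain only the allowed characters; any record that does appear is worth inspecting by hand before aligning.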


Thank you very much for your comment. I understood and redid it correctly.

2.6 years ago
Mensur Dlakic ★ 18k

Clustal Omega, like most alignment programs, expects unaligned (raw) FASTA sequences as input. If you feed it an alignment, which seems to be the case here, there will likely be sequences with gaps. That is why it suggests that you --dealign them first, which brings them back to plain FASTA sequences. It doesn't matter whether they have the same length after you de-align them; that is never a requirement for alignment programs.

In short: make sure that your sequences are in proper FASTA format before feeding them to ClustalO, or use --dealign as suggested.
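To make the effect of de-aligning concrete, here is a minimal Python sketch of the idea: gap characters are removed from the sequence lines, and nothing else is changed. It is not Clustal Omega's own implementation. The function name `dealign_fasta`, the sample records, and the assumption that gaps are written as `-` or `.` are all hypothetical choices for this example.

```python
def dealign_fasta(text, gap_chars="-."):
    """Return FASTA text with gap characters removed from sequence lines."""
    strip_gaps = str.maketrans("", "", gap_chars)
    out = []
    for line in text.splitlines():
        if line.startswith(">"):
            out.append(line)                 # header lines are kept untouched
        else:
            stripped = line.translate(strip_gaps)
            if stripped:                     # drop lines that held only gaps
                out.append(stripped)
    return "\n".join(out) + "\n"


# Hypothetical aligned input: both rows are 10 columns wide.
aligned = ">patient_1\nACGT--ACGT\n>patient_2\nACGTTTAC-T\n"
print(dealign_fasta(aligned), end="")
```

In this example the de-aligned sequences end up 8 and 9 bases long: the real bases, and therefore each patient's indels, are preserved, and only the gap placeholders from the earlier alignment are discarded before the sequences are aligned again.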