Clustalo failing to complete multiple sequence alignments
2.1 years ago
Laura ▴ 50

Hi,

I have been using clustalo in the terminal to create multiple sequence alignments for the past few weeks, but recently every run has failed. I've tried on two computers and with different input files. I've searched for the error messages but haven't found anything. Does anyone know what this means, or what might be going on?

Thanks in advance!

Here's my command:

clustalo --in=fl_L1PA2.fa --out=L1PA2.aln --force --outfmt=clustal --wrap=175 --threads=6 --verbose

And here's the output. The first few stages run fine, and then it fails:

Using 6 threads
Read 914 sequences (type: DNA) from fl_L1PA2.fa
Using 96 seeds (chosen with constant stride from length sorted seqs) for mBed (from a total of 914 sequences)
Calculating pairwise ktuple-distances...
Ktuple-distance calculation progress done. CPU time: 1035.86u 2.24s 00:17:18.09 Elapsed: 00:03:29
mBed created 31 cluster/s (with a minimum of 1 and a soft maximum of 100 sequences each)
Distance calculation within sub-clusters done. CPU time: 387.80u 0.50s 00:06:28.30 Elapsed: 00:01:18
Guide-tree computation (mBed) done.
HHalignWrapper:hhalign_wrapper.c:1419: problem in alignment (profile sizes: 1 + 1) (chr1_142876163-142882314 + chr1_223568877-223574899), forcing Viterbi
        hh-error-code=4 (mac-ram=8000)
hhalign:hhalign.cpp:961: Problem Reading/Preparing profiles (len(q)=0/len(t)=0)
HHalignWrapper:hhalign_wrapper.c:1447: problem in alignment, Viterbi did not work
        hh-error-code=4 (mac-ram=64000)
hhalign:hhalign.cpp:961: Problem Reading/Preparing profiles (len(q)=0/len(t)=0)
FATAL: could not perform alignment -- bailing out
msa clustalo • 1.2k views

We can see 914 sequences, but you didn't tell us how long they are - at least on average. It would appear that some of them are long - at least that is what forces Viterbi with proteins.

Do any of your sequences have a length of zero, meaning just the header line?

If this is truly a DNA sequence, it might be a good idea to specify --seqtype=DNA.
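One way to check both points above (how long the sequences are, and whether any record is just a header line) is a small script like the one below. This is only a sketch: `read_fasta` and `summarize` are hypothetical helper names, not part of clustalo, and it assumes a plain FASTA file in which every record starts with a `>` header line.

```python
import sys
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(lines: Iterable[str]) -> Dict[str, object]:
    """Report record count, length range, mean length and empty records."""
    records = [(name, len(seq)) for name, seq in read_fasta(lines)]
    sizes = [n for _, n in records]
    return {
        "count": len(records),
        "min": min(sizes) if sizes else 0,
        "max": max(sizes) if sizes else 0,
        "mean": sum(sizes) / len(sizes) if sizes else 0.0,
        # Headers with no sequence lines at all
        "empty": [name for name, n in records if n == 0],
    }


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as handle:
        report = summarize(handle)
    print(f"{report['count']} records, "
          f"length min/mean/max = {report['min']}/"
          f"{report['mean']:.1f}/{report['max']}")
    for name in report["empty"]:
        print(f"zero-length record: {name}")
```

Saved under a name of your choice and run with the FASTA file as its only argument (for example `fl_L1PA2.fa`), it prints the length range and lists any header that has no sequence. An empty record would be consistent with the `len(q)=0/len(t)=0` message in the log, though that alone does not prove it is the cause.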

2.1 years ago
Laura ▴ 50

This ended up being an issue between hg19 and hg38 builds. My, oh my.


Hi, I know this is an old thread, but I am encountering a similar problem. I am trying to build an MSA for the CDS sequences of 5 bumblebee species, and I get a similar error:

HHalignWrapper:hhalign_wrapper.c:1356: problem in alignment (profile sizes: 10297 + 17 0000007628.1 + ENSBTST00005022444.1), forcing Viterbi hh-error-code=3 (mac-ram=8000)

Could you explain what the issue between the different builds was, and how you fixed it? Cheers
