The Needleman-Wunsch alignment crashes on certain sequences
23 months ago
ma23 ▴ 40

Hi all,

I have two lists of sequences and need to calculate the similarity for every pair drawn from the two lists, so I run 'needle' from the EMBOSS package. Everything works until it reaches this sequence: 'IDMSAYPVESIR'. At that point the program crashes with the message:

Warning: Sequence 'embl::ref.txt:EMBOSS_001' has zero length, ignored

Error: Unable to read sequence 'seq1.txt'

Died: needle terminated: Bad value for '-asequence' and no prompt


Can anybody tell me why this is happening? What might be causing the problem?

EMBOSS Alignment • 426 views

Show us the sequence (header + sequence)


I execute the command as shown below:

needle seq1.txt seq2.txt -datafile EPAM30MS -gapopen 12 -gapextend 2 output


Each .txt file contains a single line that is exactly the sequence. The trouble begins when one of the files contains the sequence 'IDMSAYPVESIR'.


If it is a protein sequence, you need to add something like -sprotein1 -sprotein2; see http://www.bioinformatics.nl/cgi-bin/emboss/help/needle

EDIT:

To explain: this error may not have come up before if the earlier "amino acid" sequences happened to consist entirely of valid IUPAC nucleotide codes.
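To illustrate the point about ambiguous alphabets, here is a minimal Python sketch that checks whether a peptide is spelled entirely with letters that are also IUPAC nucleotide codes. The function name `looks_like_nucleotide` is made up for this example, and this is not EMBOSS's actual detection logic, only a way to see which sequences could be read either way.

```python
# Letters that are valid IUPAC nucleotide codes (including ambiguity codes).
IUPAC_NUCLEOTIDE_CODES = set("ACGTURYSWKMBDHVN")


def looks_like_nucleotide(seq: str) -> bool:
    """Return True if every letter in seq is also an IUPAC nucleotide code.

    Hypothetical helper for illustration; EMBOSS may decide differently.
    """
    return bool(seq) and set(seq.upper()) <= IUPAC_NUCLEOTIDE_CODES


if __name__ == "__main__":
    for peptide in ["IDMSAYPVESIR", "GATTACA", "MKWVDR"]:
        print(peptide, looks_like_nucleotide(peptide))
```

A peptide such as 'MKWVDR' passes this check even though it is a protein, while 'IDMSAYPVESIR' does not, because I, P and E are not nucleotide codes.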


I've added -sprotein1 -sprotein2, but it didn't help, so I assume the problem lies elsewhere.

23 months ago
ma23 ▴ 40

OK, I think I have solved the problem. The peptide that causes it begins with the residues 'ID', so needle takes that as a sign that the input is in EMBL format (EMBL entries start with an 'ID' line). I've changed the call to state the input format explicitly:

needle plain::seq1.txt plain::seq2.txt -datafile EPAM30MS -gapopen 12 -gapextend 2 output


Now it works.
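If needle is called from a script for every pair, the format prefix can be added in one place. Below is a minimal Python sketch, assuming the same matrix and gap settings as in the command above. The helper name `needle_command` is hypothetical, and the sketch uses the named qualifiers -asequence, -bsequence and -outfile instead of positional arguments. It only builds the argument list and does not run needle.

```python
def needle_command(a_file, b_file, out_file, fmt="plain"):
    """Build a needle argument list with an explicit input format prefix.

    Hypothetical helper: the 'fmt::file' prefix stops EMBOSS from guessing
    the format from the first characters of the file (e.g. 'ID' -> EMBL).
    """
    return [
        "needle",
        "-asequence", f"{fmt}::{a_file}",
        "-bsequence", f"{fmt}::{b_file}",
        "-datafile", "EPAM30MS",  # matrix name taken from the thread
        "-gapopen", "12",
        "-gapextend", "2",
        "-outfile", out_file,
    ]


if __name__ == "__main__":
    cmd = needle_command("seq1.txt", "seq2.txt", "output")
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where EMBOSS is installed.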