I am performing pairwise global alignment with the EMBOSS Needleman-Wunsch program (needle) from a Python script. The script runs fine with shorter sequences, but it throws an error when I align a pair of very long proteins (Titin, the longest protein). Specifically, I am trying to align Ensembl protein ENSP00000343764 with Swiss-Prot protein Q8WZ42. The two sequences differ in length, so I am interested in seeing the alignment. The code I used is:

    from Bio.Emboss.Applications import NeedleCommandline
    from Bio import AlignIO

    needle_cline = NeedleCommandline(asequence="Q8WZ42.fa",
                                     bsequence="ENSP00000343764.fa",
                                     outfile="ENSP00000343764.needle",
                                     gapopen=10,
                                     gapextend=0.5)
    stdout, stderr = needle_cline()

This generates an error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/pymodules/python2.7/Bio/Application/__init__.py", line 437, in __call__
        stdout_str, stderr_str)
    Bio.Application.ApplicationError: Command 'needle -outfile=ENSP00000343764.needle -asequence=Q8WZ42.fa -bsequence=ENSP00000343764.fa -gapopen=10 -gapextend=0.5' returned non-zero exit status 1, 'Needleman-Wunsch global alignment of two sequences'

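The traceback only shows the program banner, not the reason needle gave up. One way to see the full message might be to run the same command directly and capture everything it writes. This is a minimal sketch, assuming Python 3.7 or later and that `needle` is on the PATH; `run_and_capture`, `report_failure` and `NEEDLE_CMD` are names made up for this illustration, not part of Biopython or EMBOSS.

```python
import subprocess
import sys

# The exact command line reported in the traceback above.
NEEDLE_CMD = [
    "needle",
    "-outfile=ENSP00000343764.needle",
    "-asequence=Q8WZ42.fa",
    "-bsequence=ENSP00000343764.fa",
    "-gapopen=10",
    "-gapextend=0.5",
]


def run_and_capture(cmd):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


def report_failure(cmd, stream=sys.stderr):
    """Run cmd; on a non-zero exit, write the status and its stderr to stream."""
    code, _out, err = run_and_capture(cmd)
    if code != 0:
        stream.write("exit status %d\n%s\n" % (code, err.strip()))
    return code
```

Calling `report_failure(NEEDLE_CMD)` should then print whatever needle wrote to standard error before it exited, which is usually more informative than the banner alone.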
If I use only a small fragment (say, 5000 amino acids) of either sequence, the script works and generates an alignment file. I am not sure whether the error is caused by the length of the proteins. Can anyone explain the likely reason for this error and how to fix it? I could align fragments of the sequences, but that is not a good option when my script has to run over a large number of proteins. Do you have any idea how I can do this?
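As a rough check on whether length alone could explain the failure, here is a back-of-the-envelope sketch of the memory a full dynamic-programming matrix would need. It rests on assumptions that are mine, not confirmed for needle: that the aligner fills one complete (m + 1) x (n + 1) matrix, that each cell takes 4 bytes, and that both sequences are about 34350 residues long (my figure for Q8WZ42; it should be checked against the FASTA files). A real implementation may keep several such matrices.

```python
def dp_matrix_bytes(len_a, len_b, bytes_per_cell=4):
    """Bytes for one full (len_a + 1) x (len_b + 1) scoring matrix."""
    return (len_a + 1) * (len_b + 1) * bytes_per_cell


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / float(1024 ** 3)


# Assumed length for both proteins; verify against the actual sequences.
TITIN_LEN = 34350

full = dp_matrix_bytes(TITIN_LEN, TITIN_LEN)
fragment = dp_matrix_bytes(5000, TITIN_LEN)

print("full vs full:     %.2f GiB" % to_gib(full))
print("5000 aa fragment: %.2f GiB" % to_gib(fragment))
```

Under these assumptions the full-length pair needs several times more memory than the 5000-residue fragment case, which would be consistent with the fragment run succeeding while the full run fails.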
Thanks in advance!