I would like to use Biopython to align some DNA sequences. I got a script from the internet and ran it as follows (~ is short for my home directory):
    ~$ python
    Python 3.5.1 (default, Jul 3 2016, 12:57:35)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from Bio.Align.Applications import MuscleCommandline
    >>> muscle_exe = r"/usr/bin/muscle"
    >>> in_file = r"~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester.fasta"
    >>> out_file = "~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester_aligned.fasta"
    >>> muscle_cline = MuscleCommandline(muscle_exe, input=in_file, out=out_file)
    >>> print(muscle_cline)
    /usr/bin/muscle -in "~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester.fasta" -out "~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester_aligned.fasta"
But when launching the application I got:
    >>> muscle_cline()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/python3.5/lib/python3.5/site-packages/Bio/Application/__init__.py", line 516, in __call__
        stdout_str, stderr_str)
    Bio.Application.ApplicationError: Non-zero return code 137 from '/usr/bin/muscle -in "/home/gigiux/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester.fasta" -out "/home/gigiux/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester_aligned.fasta"', message 'MUSCLE v3.8.31 by Robert C. Edgar'
    >>>
When I ran muscle directly from the terminal, with no other applications running, I got:
    ~/$ muscle -in ery_multiseq.fasta -out ery_multiseq_aligned.fa
    MUSCLE v3.8.31 by Robert C. Edgar
    http://www.drive5.com/muscle
    This software is donated to the public domain.
    Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
    ery_multiseq 10 seqs, max length 1876490, avg length 1276570
    00:00:42 98 MB(-7%) Iter 1 100.00% K-mer dist pass 1
    00:00:42 98 MB(-7%) Iter 1 100.00% K-mer dist pass 2
    Killed43 608 MB(-41%) Iter 1 11.11% Align node
What could the issue be? I have read online that it might be a memory problem. If so, is the code itself OK, and how can I run large alignments?
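On whether the code itself is OK, one detail that may be worth checking is the `~` in the paths: Python does not expand it inside a string, and the shell does not expand it inside double quotes either. A minimal sketch of expanding it before building the command line, where `expand_path` is a hypothetical helper name and the path is the one from my session:

```python
import os


def expand_path(path):
    """Expand a leading ~ and return an absolute path (hypothetical helper)."""
    return os.path.abspath(os.path.expanduser(path))


# Same input path as in the interactive session above.
in_file = expand_path("~/SpiderOak Hive/LAB/Lab book/Ery/Seqs/tester.fasta")
print(in_file)
```

The expanded strings could then be passed as `input=` and `out=` to `MuscleCommandline` in place of the `~` versions.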
Return code 137 most likely indicates a memory problem, but the file I am using is only 13 MB overall and I have 15 GB of RAM.
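For reference, this is how I read the return code. It is only a sketch and assumes the usual shell convention that an exit status above 128 means "terminated by signal (status - 128)"; `describe_return_code` is a hypothetical helper name, not part of Biopython.

```python
import signal


def describe_return_code(code):
    """Return the signal name for a shell-style exit status, or None.

    Assumes the convention that status = 128 + signal number.
    """
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            # Not a signal number known on this platform.
            return None
    return None


print(describe_return_code(137))
```

On Linux, 137 - 128 = 9, which is SIGKILL. That matches the "Killed" line in the terminal output and is consistent with, though not proof of, the kernel's out-of-memory killer stopping the process.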
If I don't have enough memory, how can I extend it?
Is it possible that what I consider a small genomic job could consume so much memory?
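As a rough back-of-the-envelope check, here is a sketch of how large a full pairwise dynamic-programming matrix would be for sequences of this length. It assumes a naive quadratic matrix with one byte per cell, which is my simplification and not necessarily what MUSCLE allocates internally; the lengths are the ones MUSCLE reported.

```python
def full_matrix_bytes(len_a, len_b, bytes_per_cell=1):
    """Size of a full (len_a + 1) x (len_b + 1) alignment matrix in bytes.

    Assumption: naive quadratic dynamic programming, one byte per cell.
    """
    return (len_a + 1) * (len_b + 1) * bytes_per_cell


max_len = 1876490  # "max length" reported by MUSCLE
avg_len = 1276570  # "avg length" reported by MUSCLE
ram_bytes = 15 * 1024**3  # 15 GB of RAM

for label, length in (("max", max_len), ("avg", avg_len)):
    size = full_matrix_bytes(length, length)
    print(label, round(size / 1024**3), "GiB,", round(size / ram_bytes), "x RAM")
```

Under that assumption a single pair of the longest sequences already needs terabytes, far more than 15 GB, so the size of the FASTA file on disk says little about the memory the alignment needs.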
Is there a more memory-efficient aligner than MUSCLE? MAFFT, for instance?