FASTA with multiple sequences to an alignment object which can be used to build a phylogenetic tree
2.4 years ago
mac03pat ▴ 30

I am attempting to take a single FASTA file with multiple sequences of variable length as input and output aligned sequences that I can use to build a phylogenetic tree with Biopython's Phylo module.

Here's my file:

Things I've tried:

from Bio import AlignIO
align = AlignIO.read(open('extracted_KS_with_taxa.fa'), 'fasta')

^ Doesn't work for sequences of unequal length

from Bio.Align.Applications import MuscleCommandline

cline = MuscleCommandline(input='extracted_KS_with_taxa.fa', out='aligned_KS.aln', clwstrict=True)

^ Didn't output a file

from Bio.Align.Applications import MuscleCommandline
muscle_cline = MuscleCommandline(input='extracted_KS_with_taxa.fa')
stdout, stderr = muscle_cline()
from StringIO import StringIO
from Bio import AlignIO
align = AlignIO.read(StringIO(stdout), 'fasta')

^ Returned this error:

Traceback (most recent call last):
  File "C:\Users\mac03\AppData\Local\Programs\Python\Python37\MBSProject\", line 20, in <module>
    stdout, stderr = muscle_cline()
  File "C:\Users\mac03\AppData\Local\Programs\Python\Python37\lib\site-packages\Bio\Application\", line 527, in __call__
    stdout_str, stderr_str)
Bio.Application.ApplicationError: Non-zero return code 1 from 'muscle -in extracted_KS_with_taxa.fa', message "'muscle' is not recognized as an internal or external command,"
alignment biopython phylo clustal fasta
2.4 years ago
Joe 19k

In your first case, I think the problem is that you’re trying to use AlignIO to read a FASTA file of unaligned sequences, not an alignment (if I understand your data correctly).

AlignIO is specifically for reading formats of pre-aligned data, whereas SeqIO is what you need for reading basic sequence data.
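To illustrate the difference, here is a minimal sketch. The three short sequences and their names are made up for the example (they are not from your file), and StringIO just stands in for an open file handle. SeqIO reads sequences of any length, while AlignIO refuses input whose sequences are not all the same length:

```python
from io import StringIO

from Bio import AlignIO, SeqIO

# Hypothetical unaligned sequences of different lengths (illustration only).
unaligned_fasta = """>seq1
ACGTACGT
>seq2
ACGTAC
>seq3
ACGTACGTAA
"""

# SeqIO is happy with plain, unaligned sequence records.
records = list(SeqIO.parse(StringIO(unaligned_fasta), "fasta"))
lengths = {record.id: len(record.seq) for record in records}
print(lengths)

# AlignIO expects an alignment, so unequal lengths raise a ValueError.
try:
    AlignIO.read(StringIO(unaligned_fasta), "fasta")
    alignio_error = None
except ValueError as err:
    alignio_error = str(err)
print(alignio_error)
```

With a real file you would pass the filename (e.g. 'extracted_KS_with_taxa.fa') to SeqIO.parse instead of the StringIO handle.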

Secondly, print(cline) doesn’t run anything, because that’s just the command line itself, not the result of the alignment. You first need to actually run MUSCLE by calling the command line object, i.e. cline(), at which point Biopython runs the program for you (you also need it installed).

The fact that you don’t have MUSCLE installed is why your last command is failing: Biopython shells out to run muscle on the command line, but the system doesn’t recognise the command because there is no muscle binary installed (or it is not on your PATH).
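One way to catch this early is to check for the binary before trying to run it. The sketch below uses only the standard library rather than the Biopython wrapper. The helper names build_muscle_command and run_muscle are made up for this example, and the -in/-out flags are assumed from the 'muscle -in ...' command in your traceback; other MUSCLE versions may use different flags, so check the documentation for whichever one you install:

```python
import shutil
import subprocess


def build_muscle_command(exe, in_file, out_file):
    """Build the argument list, assuming the -in/-out flag style
    seen in the traceback above (an assumption, not checked here)."""
    return [exe, "-in", in_file, "-out", out_file]


def run_muscle(in_file, out_file, exe_name="muscle"):
    """Run MUSCLE if it can be found on PATH, otherwise fail clearly."""
    exe = shutil.which(exe_name)  # None if the command is not on PATH
    if exe is None:
        raise FileNotFoundError(
            f"'{exe_name}' was not found on PATH; install MUSCLE or "
            "pass the name/path of the executable via exe_name."
        )
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(build_muscle_command(exe, in_file, out_file), check=True)
    return out_file


# Example call (only works once MUSCLE is installed):
# run_muscle("extracted_KS_with_taxa.fa", "aligned_KS.aln")
```

If I remember the tutorial correctly, MuscleCommandline also accepts the path to the executable as its first argument, which is the other way around the PATH problem on Windows.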

I suggest you look closely at the Biopython Tutorial, as there are a good many things you’ve got mixed up here.
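For the tree-building step itself, the following is a rough sketch of one possible route, assuming you already have an aligned FASTA. The four toy sequences are invented for illustration, and the 'identity' distance model with neighbour joining is just one simple choice, not a recommendation for your data:

```python
from io import StringIO

from Bio import AlignIO, Phylo
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor

# Hypothetical, already-aligned sequences (all the same length, gaps as '-').
aligned_fasta = """>seq1
ACGTACGT
>seq2
ACGTACGA
>seq3
ACG-ACGA
>seq4
TCG-ACGA
"""

# AlignIO is the right tool here because the input is an alignment.
alignment = AlignIO.read(StringIO(aligned_fasta), "fasta")

# Pairwise distances from simple sequence identity, then neighbour joining.
calculator = DistanceCalculator("identity")
constructor = DistanceTreeConstructor(calculator, "nj")
tree = constructor.build_tree(alignment)

# Text rendering of the tree; Phylo.write(tree, "tree.nwk", "newick") would save it.
Phylo.draw_ascii(tree)
```

With real data you would replace the StringIO handle with the alignment file that MUSCLE wrote.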

