Question: FASTA with multiple sequences to an alignment object which can be used to build a phylogenetic tree
23 days ago by
mac03pat10 wrote:

I am attempting to take a single FASTA file with multiple sequences of variable length as input and output aligned sequences that I can use to build a phylogenetic tree with Biopython's Phylo module.

Here's my file:

Things I've tried:

from Bio import AlignIO
align = AlignIO.read(open('extracted_KS_with_taxa.fa'), 'fasta')

^ Doesn't work for sequences of unequal length

from Bio.Align.Applications import MuscleCommandline

cline = MuscleCommandline(input='extracted_KS_with_taxa.fa', out='aligned_KS.aln', clwstrict=True)

^ Didn't output a file

from Bio.Align.Applications import MuscleCommandline
muscle_cline = MuscleCommandline(input='extracted_KS_with_taxa.fa')
stdout, stderr = muscle_cline()
from StringIO import StringIO
from Bio import AlignIO
align = AlignIO.read(StringIO(stdout), 'fasta')

^ Returned this error:

Traceback (most recent call last):
  File "C:\Users\mac03\AppData\Local\Programs\Python\Python37\MBSProject\", line 20, in <module>
    stdout, stderr = muscle_cline()
  File "C:\Users\mac03\AppData\Local\Programs\Python\Python37\lib\site-packages\Bio\Application\", line 527, in __call__
    stdout_str, stderr_str)
Bio.Application.ApplicationError: Non-zero return code 1 from 'muscle -in extracted_KS_with_taxa.fa', message "'muscle' is not recognized as an internal or external command,"
22 days ago by
United Kingdom
jrj.healey12k wrote:

In your first case, I think the problem is that you’re using AlignIO to read a FASTA file of unaligned sequences, not an alignment (if I understand your data correctly).

AlignIO is specifically for reading formats of pre-aligned data, whereas SeqIO is what you need for reading basic sequence data.
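To make the distinction concrete, here is a minimal standard-library sketch that checks whether the records in a FASTA file all have the same length, which is the property an alignment reader expects. The helpers `fasta_lengths` and `looks_aligned` and the two toy inputs are hypothetical illustrations, not part of Biopython; with Biopython itself you would read the raw records with SeqIO instead.

```python
def fasta_lengths(text):
    """Return {record_id: sequence_length} for FASTA-formatted text."""
    lengths = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-separated token as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence line: may be wrapped over several lines, so accumulate.
            lengths[current] += len(line)
    return lengths


def looks_aligned(text):
    """True if there is at least one record and all records share one length."""
    return len(set(fasta_lengths(text).values())) == 1


# Toy data (made up for illustration): the second input is gap-padded.
raw = ">seqA\nATGGCGT\n>seqB\nATGG\n"
aligned = ">seqA\nATGGCGT\n>seqB\nATGG---\n"

print(fasta_lengths(raw))
print(looks_aligned(raw), looks_aligned(aligned))
```

If the check reports unequal lengths, the file still needs to go through an aligner before an alignment reader will accept it.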

Secondly, print(cline) doesn’t run anything, because cline is just the command line itself, not the result of the alignment. You first need to run MUSCLE by calling the object, as your third attempt does with muscle_cline() (and for that you also need MUSCLE installed).

The fact that you don’t have MUSCLE installed is why your last command fails: Biopython shells out to run muscle on the command line, but the shell doesn’t recognise the command because there’s no muscle binary installed (or at least none on your PATH).
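A rough sketch of both points, using only the standard library: building a command is separate from running it, and it is worth checking that the executable can be found before shelling out. The function names `build_muscle_command` and `run_muscle` are hypothetical, and the `-in`/`-out` options are assumed to match the MUSCLE v3-style call shown in your traceback; other MUSCLE releases may use different options, so check the help output of the version you install.

```python
import shutil
import subprocess


def build_muscle_command(in_path, out_path, exe="muscle"):
    """Build (but do not run) a MUSCLE v3-style command line (assumed flags)."""
    return [exe, "-in", in_path, "-out", out_path]


def run_muscle(in_path, out_path, exe="muscle"):
    """Run the aligner if it can be found on PATH; otherwise fail clearly."""
    resolved = shutil.which(exe)
    if resolved is None:
        raise FileNotFoundError(
            f"{exe!r} was not found on PATH; install it or pass its full path"
        )
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(build_muscle_command(in_path, out_path, exe=resolved), check=True)
    return out_path


cmd = build_muscle_command("extracted_KS_with_taxa.fa", "aligned_KS.aln")
print(" ".join(cmd))            # just the command text; nothing has run yet
print(shutil.which("muscle"))   # None means the shell cannot find muscle
```

Once `shutil.which("muscle")` returns a path, calling `run_muscle("extracted_KS_with_taxa.fa", "aligned_KS.aln")` should write the output file, and that file is what you would then hand to AlignIO.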

I suggest you look closely at the BioPython Tutorial, as there are a good many things you’ve got mixed up here.
