Question: A PWM With Gapped Alignments In Biopython
8.3 years ago by
RossCampbell140 wrote:

I'm trying to generate a position weight matrix (PWM) in Biopython from ClustalW multiple sequence alignments. I get a "Wrong Alphabet" error every time I do it with gapped alignments. From reading the documentation, I think I need to use the Gapped alphabet to deal with the '-' character in gapped alignments, but doing so doesn't resolve the error. Does anyone see the problem with this code, or have a better way to generate a PWM from gapped Clustal alignments?

from Bio import AlignIO, Motif
from Bio.Alphabet import Gapped
alignment = AlignIO.read("filename.clustalw", "clustal", alphabet=Gapped)
m = Motif.Motif()
for a in alignment:
    m.add_instance(a.seq)

Here's an example of how to do it with unambiguous_dna:

m = motifs.create(list, alphabet=Gapped(IUPAC.unambiguous_dna))

written 4.9 years ago by jykzel0
8.3 years ago by
Brad Chapman9.5k (Boston, MA) wrote:

Paul-Michael's answer is exactly right and you were nearly there with Gapped. Just pass the same alphabet to AlignIO and Motif; here is a working example with a protein alignment:

from Bio import AlignIO, Motif
from Bio.Alphabet import IUPAC, Gapped

# Use one gapped alphabet for both the alignment and the motif
alphabet = Gapped(IUPAC.protein)
alignment = AlignIO.read("cw02.aln", "clustal", alphabet=alphabet)
m = Motif.Motif(alphabet)
for a in alignment:
    m.add_instance(a.seq)
print(m.pwm()[0])

With a DNA alignment you'd want IUPAC.unambiguous_dna.
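
To make concrete what a PWM over a gapped alphabet contains, here is a minimal, dependency-free sketch. It is not Biopython's implementation: the function name `position_frequencies`, the alphabet string "ACGT-", and the three-row alignment are all made up for illustration. It simply treats '-' as one more letter and computes per-column frequencies.

```python
from collections import Counter


def position_frequencies(sequences, alphabet="ACGT-"):
    """Return one {letter: frequency} dict per alignment column.

    Hypothetical helper for illustration; '-' is counted like any other letter.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must have equal length")

    matrix = []
    for column in zip(*sequences):
        counts = Counter(column)
        # Reject letters outside the declared alphabet (e.g. '-' with "ACGT").
        unknown = set(counts) - set(alphabet)
        if unknown:
            raise ValueError("letters not in alphabet: %s" % sorted(unknown))
        matrix.append(
            {letter: counts[letter] / len(sequences) for letter in alphabet}
        )
    return matrix


# Made-up gapped DNA alignment, one string per row.
rows = ["AC-T", "ACGT", "A-GT"]
pwm = position_frequencies(rows)
print(pwm[0])
```

Calling the same function with `alphabet="ACGT"` on these rows raises a ValueError, which is the toy analogue of leaving the gap character out of the alphabet.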

8.3 years ago by
London, UK
Agapow270 wrote:

Just quickly: the proximate problem here is that the alphabets of the Motif and the aligned sequence being added have to be the same (see line 44 of ...). So you could create the motif with the same alphabet, or perhaps try setting it to None, which would avoid the alphabet comparison.

Biopython Alphabets are a tricky subject - I regularly have to pore over the source code to understand how they work, only to forget it all by the next time something odd happens.
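
The kind of check described above can be sketched with a toy class. This is only an illustration of the idea, not Biopython's code: `ToyMotif`, its `add_instance` signature, and the plain-string alphabets are all hypothetical.

```python
class ToyMotif:
    """Hypothetical stand-in for a motif container; not Biopython's class."""

    def __init__(self, alphabet):
        self.alphabet = alphabet
        self.instances = []

    def add_instance(self, sequence, alphabet):
        # Refuse sequences whose declared alphabet differs from the motif's.
        if alphabet != self.alphabet:
            raise ValueError("Wrong Alphabet")
        self.instances.append(sequence)


PROTEIN = "ACDEFGHIKLMNPQRSTVWY"
GAPPED_PROTEIN = PROTEIN + "-"

motif = ToyMotif(GAPPED_PROTEIN)
motif.add_instance("MK-V", GAPPED_PROTEIN)  # same alphabet: accepted

try:
    motif.add_instance("MKLV", PROTEIN)  # different alphabet: rejected
except ValueError as err:
    mismatch_error = str(err)
```

Note that the second sequence is rejected even though every one of its letters is valid in the gapped alphabet: the comparison is between the declared alphabets, not the letters actually present.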


Which led me to finally write down all my notes on Alphabets:

modified 5 weeks ago by RamRS24k • written 8.3 years ago by Agapow270