Biopython Bio.motifs: How to create a motif object with aligned sequences
21 months ago
kinetic • 0

I'm following this Biopython tutorial. Where the tutorial uses DNA "instances" to create a motif, I need to use an aligned fasta.

I tried

from Bio import AlignIO, motifs
from Bio.Alphabet import IUPAC, Gapped

alphabet = Gapped(IUPAC.protein)
alignment = AlignIO.read("my_seqs.afa", "fasta", alphabet=alphabet)
m = motifs.create(alignment)

but that results in

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/m/.local/lib/python2.7/site-packages/Bio/motifs/__init__.py", line 24, in create
    return Motif(instances=instances, alphabet=alphabet)
  File "/home/m/.local/lib/python2.7/site-packages/Bio/motifs/__init__.py", line 273, in __init__
    counts = self.instances.count()
  File "/home/m/.local/lib/python2.7/site-packages/Bio/motifs/__init__.py", line 220, in count
    for letter in self.alphabet:
TypeError: 'NoneType' object is not iterable

I looked through the documentation, but I can't find anything that specifies the expected input for motifs.create. Does it not work with aligned sequences or am I just reading them in incorrectly?

21 months ago
Eric Lim ★ 1.7k

The tutorial you linked shows that Bio.motifs.create takes a list of Seq objects. AlignIO.read returns an alignment object, and iterating over it yields SeqRecord objects, each of which carries its sequence as a Seq in the .seq attribute. So [x.seq for x in alignment] is what you need to pass to motifs.create.
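To make the SeqRecord versus Seq distinction concrete, here is a minimal sketch. It reads three short DNA sequences from an in-memory string, which is an assumption made only to keep the snippet self-contained; a file path works the same way. The variable names are illustrative.

```python
from io import StringIO

from Bio import AlignIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical in-memory FASTA; all sequences have the same length.
fasta_text = ">1\nAGCTAGCG\n>2\nGTCGAGCC\n>3\nGTAGCGCG\n"

alignment = AlignIO.read(StringIO(fasta_text), "fasta")

# Iterating over the alignment yields SeqRecord objects, not Seq objects.
for record in alignment:
    print(type(record).__name__, record.id, type(record.seq).__name__)

# Extract the bare Seq objects; this list is what motifs.create expects.
sequences = [record.seq for record in alignment]
```

The resulting sequences list is what goes into motifs.create. Note that Biopython 1.78 and later removed the Bio.Alphabet module, so on a current install the Gapped(IUPAC.protein) import from the question is no longer available; there, motifs.create takes the alphabet as a plain string of letters, defaulting to "ACGT".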

See below for a working example.

[~/Downloads/tmp]$ cat test.fa
>1
AGCTAGCG
>2
GTCGAGCC
>3
GTAGCGCG

[~/Downloads/tmp]$ ipython
In [1]: from Bio import AlignIO

In [2]: from Bio import motifs

In [3]: alignment = AlignIO.read("test.fa", "fasta")

In [4]: m = motifs.create([x.seq for x in alignment])

In [5]: m.consensus
Out[5]: Seq('GTCGAGCG', IUPACUnambiguousDNA())

In [6]: m.counts
Out[6]:
{'G': [2, 1, 0, 2, 0, 3, 0, 2],
 'A': [1, 0, 1, 0, 2, 0, 0, 0],
 'T': [0, 2, 0, 1, 0, 0, 0, 0],
 'C': [0, 0, 2, 0, 1, 0, 3, 1]}
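
If you want to check the counts above by hand, the following plain-Python sketch tallies each letter per alignment column for the same three sequences. The helpers count_matrix and consensus are hypothetical names written for this illustration; they are not Biopython's implementation, and the tie-breaking rule (first letter in alphabet order wins) is an assumption that may differ from the library's.

```python
sequences = ["AGCTAGCG", "GTCGAGCC", "GTAGCGCG"]
alphabet = "ACGT"


def count_matrix(seqs, letters):
    """Count how often each letter occurs at each alignment position."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    counts = {letter: [0] * length for letter in letters}
    for s in seqs:
        for position, letter in enumerate(s):
            # A letter outside `letters` raises KeyError here.
            counts[letter][position] += 1
    return counts


def consensus(counts):
    """Pick the most frequent letter in each column (first letter wins ties)."""
    length = len(next(iter(counts.values())))
    return "".join(
        max(counts, key=lambda letter: counts[letter][pos])
        for pos in range(length)
    )


counts = count_matrix(sequences, alphabet)
print(counts)
print(consensus(counts))
```

Every column has a single most frequent letter in this example, so the tie-breaking rule never comes into play.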