Simple Fasta Parsing Is Not Simple.
4
3
11.6 years ago
Vlinxify ▴ 120

When trying out the examples from chapter 2.3 of the Biopython 1.54b tutorial, I keep running into the same annoying problem. When I use:

from Bio import SeqIO

for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
    print seq_record.id
    print repr(seq_record.seq)
    print len(seq_record)


the Python interpreter tells me I should use a handle instead of a filename. I use Biopython 1.53 and not 1.54b, for which this tutorial was written. I can't find an older archived version of the tutorial that would show me how to use a handle instead of the simple filename method (ls_orchid.fasta in this case). I know that in older versions of Biopython you have to do everything with handles, but not in the newer versions.

The new tutorial now online has an obscure last chapter on how to use handles, but that's hardly helpful.

I think the tutorial is great, I just haven't got the right version installed. I use Ubuntu and installed Biopython via the Ubuntu Software Center.

Can anyone help me use a handle and get the above parsing example working in 1.53?

Thank you,

python biopython fasta file • 3.8k views
5
11.6 years ago

Try the following, where f is a file handle associated with your file of interest:

from Bio import SeqIO

f = open("ls_orchid.fasta")
for seq_record in SeqIO.parse(f, "fasta"):
    print seq_record.id
    print repr(seq_record.seq)
    print len(seq_record)
f.close()

0

It worked! So this is how handles work. I remember reading somewhere that you always have to close the handle too, so I see why you use the last statement. You wrap it and then you unwrap it.

0

Closing handles is particularly important for output, but it is good practice on input too. When the handle goes out of scope and gets garbage collected, it will be closed automatically.

5
11.6 years ago

Hi, just a very tiny addition to Gurado's answer, to make it more idiomatic Python.

from Bio import SeqIO

with open("ls_orchid.fasta") as f:
    for seq_record in SeqIO.parse(f, "fasta"):
        print seq_record.id
        print repr(seq_record.seq)
        print len(seq_record)


Using this form, you don't have to close the handle manually (or forget to do so!). The file is closed automatically when execution leaves the 'with open()' block. Cheers!
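
To see the automatic closing for yourself, here is a minimal sketch that uses only the standard library, so it runs without Biopython. The temporary file and its two made-up records are assumptions for illustration, not part of the orchid example.

```python
import os
import tempfile

# Write a tiny, made-up FASTA file to a temporary location (illustration only).
fd, path = tempfile.mkstemp(suffix=".fasta")
with os.fdopen(fd, "w") as out:
    out.write(">example_1\nACGT\n>example_2\nGGCC\n")

# Count header lines; the handle is closed as soon as the block ends.
with open(path) as f:
    n_records = sum(1 for line in f if line.startswith(">"))

handle_closed = f.closed  # the handle's state after leaving the 'with' block
os.remove(path)
print(n_records, handle_closed)
```

The name f still exists after the block, but the handle it refers to is closed, so any further read on it would raise an error.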

1

The 'with' statement is elegant, but (maybe this is just my pet peeve) I try to avoid it, simply because it adds one more level of indentation.

0

I LOVE using the with statement nowadays. I really wonder how I ever lived without it.

0

That won't work on Python 2.4 (which Biopython still supports for now)
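
On interpreters without the with statement, a try/finally block gives the same guarantee that the handle gets closed even if the loop fails. A rough sketch using only the standard library; the temporary file and the record name are made up and simply stand in for ls_orchid.fasta:

```python
import os
import tempfile

# Create a small, made-up FASTA file to read back (illustration only).
fd, path = tempfile.mkstemp(suffix=".fasta")
os.close(fd)
out = open(path, "w")
try:
    out.write(">example_1\nACGT\n")
finally:
    out.close()  # runs even if the write raises

# Collect record IDs; 'finally' closes the handle whatever happens in the loop.
ids = []
f = open(path)
try:
    for line in f:
        if line.startswith(">"):
            ids.append(line[1:].strip())
finally:
    f.close()

os.remove(path)
print(ids)
```

On Python 2.5 the with statement is available after `from __future__ import with_statement`; from 2.6 onwards it is enabled by default.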

4
11.6 years ago

The tutorial you are reading describes a feature that was introduced in Biopython 1.54.

In all earlier versions, SeqIO.parse required a file-like handle as its first argument, so you had to write something like SeqIO.parse(open("seq.fasta", "r"), "fasta"), as explained in the other answers.

Starting from Biopython 1.54, SeqIO.parse accepts either a file handle or a filename as its first argument, so SeqIO.parse("ls_orchid.fasta", "fasta") works correctly.

I'll ask the Biopython developers to add a note about this to the current tutorial; it may be useful to somebody else.
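
The "filename or handle" idea can be pictured with a small standard-library sketch. To be clear, this is not Biopython's implementation: read_ids is a hypothetical helper written only for illustration, and the two records are made up.

```python
import io


def read_ids(source):
    """Return FASTA record IDs from a filename or an open handle.

    Hypothetical helper for illustration; not part of Biopython.
    """
    opened_here = isinstance(source, str)  # a string is treated as a filename
    handle = open(source) if opened_here else source
    try:
        # The ID is the first space-delimited word after '>' on a header line.
        return [line[1:].strip().split(" ")[0]
                for line in handle if line.startswith(">")]
    finally:
        # Only close what we opened; a caller's handle stays under their control.
        if opened_here:
            handle.close()


# An in-memory handle stands in for an opened file.
demo = io.StringIO(">seq1 first record\nACGT\n>seq2\nTTGA\n")
ids = read_ids(demo)
print(ids)
```

The design choice worth noting is in the finally clause: a function that opens a file itself should close it, while a handle passed in by the caller is left open for them to manage.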

0

Nice clear answer Giovanni :)

I've just clarified the Tutorial's FAQ to point to the appendix on handles, where I've added a couple of examples.

0

That one bit me too a few weeks ago, I admit.

0

You are welcome :-)

2
11.6 years ago
Peter 6.0k

Others have already shown some explicit examples of how to use a handle, and the tutorial has been updated to try to make this clearer for Biopython 1.54.

You also asked where to find older copies of the tutorial. They are included in the .zip and .tar.gz archives of Biopython on our downloads page. For Biopython 1.53 try:

http://biopython.org/DIST/biopython-1.53.zip

http://biopython.org/DIST/biopython-1.53.tar.gz

Hope that helps.