How can I convert a genbank file to a FASTA file? IOError
6.7 years ago


I am trying to convert a GenBank file to a FASTA file, but it seems that Python can't recognize the GenBank format, and I don't know how to solve the problem.

To test this, I tried an example I found on a website; the script and the terminal output are below. Thanks for your answers.


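# Write the protein translation of every CDS in a GenBank file to a FASTA file
from Bio import SeqIO

gbk_filename = "NC_005213.gbk"
faa_filename = "NC_005213_converted.faa"
input_handle  = open(gbk_filename, "r")
output_handle = open(faa_filename, "w")
for seq_record in SeqIO.parse(input_handle, "genbank") :
    # The arguments of the two format strings below were cut off in the post;
    # the record ID, locus tag, record name and translation are assumed here.
    print("Dealing with GenBank record %s" % seq_record.id)
    for seq_feature in seq_record.features :
        if seq_feature.type=="CDS" :
            assert len(seq_feature.qualifiers['translation'])==1
            output_handle.write(">%s from %s\n%s\n" % (
                seq_feature.qualifiers['locus_tag'][0],
                seq_record.name,
                seq_feature.qualifiers['translation'][0]))
output_handle.close()
input_handle.close()
print("Done")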


albam@albam-TravelMate-P253:~/Desktop$ python
Traceback (most recent call last):
  File "", line 6, in <module>
    input_handle  = open(gbk_filename, "r")
IOError: [Errno 2] No such file or directory: 'NC_005213.gbk'



python biopython genbank IOError • 2.4k views

A few suggestions for your script that make use of some better practices:

  • Do not hard-code things. With the sys module, you can pass command-line arguments to your script!
  • Instead of individual calls to open and file.close, use the with open()... syntax. It makes file operations a bit safer, since the file is closed automatically when the with block completes.
  • Always read the error messages; they contain useful information. For example, the IOError exception gave you a very useful error message.
    • In general, exceptions are used to notify the programmer that the function raising the exception received a value or values outside the domain of that function. In other words, they're there to give the programmer/user (usually) useful information about why the program crashed. Generally speaking, you, the programmer/user, are the one who provided this invalid input. In your case, you provided a path to a file that does not exist.
  • You said: "Python does not understand genbank". You're correct, but not in the way you think. Python has no idea what a GenBank file is! Biopython provides several functions that let you take the data in a GenBank file and convert it into another format. You can think of the Biopython code as the list of Python instructions required to turn information in GenBank format into Python values (strings, etc.).
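
The first three suggestions could look roughly like the sketch below. It is a minimal example, not a definitive rewrite of the script: the names `cds_to_fasta`, `main` and `convert.py` are made up for illustration, and skipping CDS features without a translation and writing "unknown" for a missing locus_tag are assumptions about what you want.

```python
import sys


def cds_to_fasta(gbk_path, faa_path):
    """Write the translation of every CDS feature in gbk_path to faa_path."""
    # Imported here so that the rest of the module loads without Biopython.
    from Bio import SeqIO

    count = 0
    # Both files are closed automatically when the with block ends.
    with open(gbk_path) as in_handle, open(faa_path, "w") as out_handle:
        for record in SeqIO.parse(in_handle, "genbank"):
            for feature in record.features:
                # Assumption: CDS features without a translation are skipped.
                if feature.type == "CDS" and "translation" in feature.qualifiers:
                    out_handle.write(">%s from %s\n%s\n" % (
                        feature.qualifiers.get("locus_tag", ["unknown"])[0],
                        record.name,
                        feature.qualifiers["translation"][0]))
                    count += 1
    return count


def main(argv):
    """argv[1] is the GenBank input file, argv[2] the FASTA output file."""
    if len(argv) != 3:
        sys.stderr.write("usage: python convert.py INPUT.gbk OUTPUT.faa\n")
        return 1
    try:
        written = cds_to_fasta(argv[1], argv[2])
    except IOError as err:  # also catches FileNotFoundError on Python 3
        sys.stderr.write("Cannot open file: %s\n" % err)
        return 1
    print("Wrote %d protein sequences to %s" % (written, argv[2]))
    return 0
```

To use it as a script, call `sys.exit(main(sys.argv))` at the bottom of the file and run something like `python convert.py NC_005213.gbk NC_005213_converted.faa` from the directory that contains the GenBank file.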

I am trying to work with a cDNA sequence from GenBank, but I don't know which function to use to open the sequence directly from Python. The sequence is not saved on my computer, which is why I need to open it directly, without downloading it.

I don't know if you understand me...

Thank you


You can't open a file that you do not have. You're telling Python to find a file under ~/Desktop, but there is no such file there. You have to have the file and tell the script where to find it; you can't just pull files out of nowhere.
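
If the record only exists in GenBank so far, one possible approach is to download it first and then parse the saved copy. The sketch below assumes Biopython's `Bio.Entrez` module and a working network connection; the function name `fetch_genbank` is hypothetical, and the `gbwithparts` return type is an assumption chosen so that the full record, including the sequence, is returned.

```python
def fetch_genbank(accession, email, out_path):
    """Download one nucleotide record from NCBI and save it as a GenBank file."""
    if not email:
        # NCBI asks every Entrez client to identify itself with an address.
        raise ValueError("an e-mail address is required for NCBI Entrez")

    # Imported here because it needs Biopython; the call below needs network.
    from Bio import Entrez

    Entrez.email = email
    handle = Entrez.efetch(db="nucleotide", id=accession,
                           rettype="gbwithparts", retmode="text")
    try:
        with open(out_path, "w") as out_handle:
            out_handle.write(handle.read())
    finally:
        handle.close()
    return out_path
```

A call such as `fetch_genbank("NC_005213", "you@example.org", "NC_005213.gbk")` (with your own address in place of the placeholder) would leave a local file that `open()` and `SeqIO.parse` can then read.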


Can you explain this a little more? I am trying to get the same script to work, but I don't like the hard-coding of NC_005213.gbk. I want to be able to read many files and have each of them parsed to a new file. Can you explain how this can be done?
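
One way to handle many files is to loop over a list of input names and derive each output name from its input. The sketch below is only an illustration: `converted_name` and `convert_many` are made-up names, the `_converted.faa` suffix simply copies the naming used in the question, and `convert_one` stands for whatever single-file conversion function you already have.

```python
import os


def converted_name(gbk_path):
    """Derive the output name, e.g. 'x/a.gbk' -> 'x/a_converted.faa'."""
    base, _ext = os.path.splitext(gbk_path)
    return base + "_converted.faa"


def convert_many(gbk_paths, convert_one):
    """Call convert_one(input_path, output_path) for every existing input."""
    done = []
    for gbk_path in gbk_paths:
        if not os.path.isfile(gbk_path):
            # Report the problem and carry on with the remaining files.
            print("Skipping %s: no such file" % gbk_path)
            continue
        faa_path = converted_name(gbk_path)
        convert_one(gbk_path, faa_path)
        done.append(faa_path)
    return done
```

The list of inputs can come from the command line, for example `convert_many(sys.argv[1:], convert_one)`, or from a pattern such as `glob.glob("*.gbk")`.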

6.7 years ago
Neilfws 49k

The error suggests that the file NC_005213.gbk is not in the directory ~/Desktop.
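
A quick way to check this from Python is to print the full path that a relative file name resolves to, and whether a file is actually there. This is a small sketch; `describe_path` is a hypothetical helper name.

```python
import os


def describe_path(filename):
    """Return the absolute path Python would open and whether a file is there."""
    # A relative name is resolved against the current working directory.
    full_path = os.path.abspath(filename)
    return full_path, os.path.isfile(full_path)


full_path, found = describe_path("NC_005213.gbk")
print("Looking for: %s" % full_path)
print("File exists: %s" % found)
```

If the second line prints False, either move the file into that directory, change into the directory that holds it before starting Python, or pass the full path to `open()`.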

