I should start by saying that I'm as new as it gets to both Python and Biopython. I'm trying to split a large .fasta file (with multiple entries) into separate files, each containing a single entry. I found most of the following code on the Biopython wiki/Cookbook site and adapted it slightly. My problem is that it names the output files "1.fasta", "2.fasta", etc., and I need them named by an identifier such as the GI number.
    def batch_iterator(iterator, batch_size) :
        """Returns lists of length batch_size.

        This can be used on any iterator, for example to batch up
        SeqRecord objects from Bio.SeqIO.parse(...), or to batch
        Alignment objects from Bio.AlignIO.parse(...), or simply
        lines from a file handle.

        This is a generator function, and it returns lists of the
        entries from the supplied iterator. Each list will have
        batch_size entries, although the final list may be shorter.
        """
        entry = True  # Make sure we loop once
        while entry :
            batch = []
            while len(batch) < batch_size :
                try :
                    entry = next(iterator)
                except StopIteration :
                    entry = None
                if entry is None :
                    # End of file
                    break
                batch.append(entry)
            if batch :
                yield batch

    from Bio import SeqIO

    infile = input('Which .fasta file would you like to open? ')
    record_iter = SeqIO.parse(open(infile), "fasta")
    for i, batch in enumerate(batch_iterator(record_iter, 1)) :
        outfile = "c:\python32\myfiles\%i.fasta" % (i+1)
        handle = open(outfile, "w")
        count = SeqIO.write(batch, handle, "fasta")
        handle.close()
        print ("Wrote %i records to %s" % (count, outfile))
If I try to replace:

    outfile = "c:\python32\myfiles\%i.fasta" % (i+1)

with:

    outfile = "c:\python32\myfiles\%s.fasta" % (record_iter.id)

so that each file is named by something like seq_record.id in SeqIO, I get the following error:

    Traceback (most recent call last):
      File "C:\Python32\myscripts\generator.py", line 33, in <module>
        outfile = "c:\python32\myfiles\%s.fasta" % (record_iter.id)
    AttributeError: 'generator' object has no attribute 'id'
Although the generator function has no attribute 'id', can I get around this somehow? Is this script too complicated for what I'm trying to do?!? Thanks, Charles
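For reference, here is a rough sketch of one possible way around the error, not a definitive solution. The idea is that the `id` belongs to each record yielded by `SeqIO.parse`, not to the generator itself, so the file name can be built from the record inside the loop. The helper names `gi_from_id`, `filename_for` and `split_fasta` are made up for illustration, and the sketch assumes NCBI-style ids of the form `gi|<number>|...`; ids in any other format are used as-is, with characters that Windows does not allow in file names replaced by underscores.

```python
import os
import re


def gi_from_id(record_id):
    """Return the GI number from an id like 'gi|12345|ref|NM_000001.1|'.

    Returns None when the id does not follow that (assumed) pattern.
    """
    parts = record_id.split("|")
    if len(parts) >= 2 and parts[0] == "gi" and parts[1].isdigit():
        return parts[1]
    return None


def filename_for(record_id):
    """Build a file name from a record id (hypothetical helper)."""
    name = gi_from_id(record_id) or record_id
    # Replace characters that are not allowed in Windows file names.
    name = re.sub(r'[\\/:*?"<>|]', "_", name)
    return name + ".fasta"


def split_fasta(infile, outdir):
    """Write each record of infile to its own file in outdir."""
    # Imported here so the helpers above can be used without Biopython.
    from Bio import SeqIO

    with open(infile) as in_handle:
        # Each item is a single SeqRecord, which does have an .id attribute.
        for record in SeqIO.parse(in_handle, "fasta"):
            outfile = os.path.join(outdir, filename_for(record.id))
            with open(outfile, "w") as out_handle:
                count = SeqIO.write([record], out_handle, "fasta")
            print("Wrote %i records to %s" % (count, outfile))
```

With this, `split_fasta(infile, r"c:\python32\myfiles")` would replace the batching loop entirely, since a batch size of 1 is the same as iterating over the records directly. Two records with the same id would overwrite each other's file, so this assumes the ids are unique.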