I have a GenBank file containing multiple genomes (concatenated using cat). I want to produce a subset of this file containing only the 'CDS' features. It seems trivial, but I am having trouble with it. Here is the code, which successfully prints each CDS feature but fails to write a GenBank file containing them.
    from Bio import SeqIO, SeqFeature
    import sys

    gbank = SeqIO.parse(open(sys.argv[1], "rU"), "genbank")
    for genome in gbank:
        print "looking in %s" % genome.id
        for gene in genome.features:
            if gene.type == 'CDS':
                CDS = gene
                print CDS

    output_handle = open("all_CDS.gbk", "w")
    SeqIO.write(CDS, output_handle, "genbank")
    output_handle.close()
The code prints each CDS, but it produces the following error at the end:
AttributeError: 'int' object has no attribute 'name'