Hello, does anyone know how to retrieve the organism source and country of isolation from a set of NCBI or GenBank records with a Perl or Python script?
There was an earlier post on Biostars:
How To Extract Title And /Isolation_Source From A List Of Genbank Accession Numbers
    from Bio import SeqIO

    gb_file = "filename_of_gb.txt"

    # Open the output once; it is tab-separated text despite the .xls name
    with open("Newfile.xls", "a") as f:
        for gb_record in SeqIO.parse(gb_file, "genbank"):
            for feat in gb_record.features:
                # Only the 'source' feature carries /isolation_source;
                # records without that qualifier are skipped
                if feat.type == "source" and "isolation_source" in feat.qualifiers:
                    isolation_source = feat.qualifiers["isolation_source"]
                    C = " Name %s \t Organism %s \t isolation_source %s " % (
                        gb_record.name,
                        gb_record.annotations["source"],
                        isolation_source[0],
                    )
                    print(C, file=f)
Use the above Python script to retrieve the isolation source from the saved GenBank file.
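The question also asks for the organism and the country of isolation, which the script above does not pull out. Here is a minimal sketch of one way to get them from the same 'source' feature, assuming Biopython is installed and the file is in GenBank flat-file format. The helper names (`source_qualifiers`, `write_source_table`), the output filename, and the choice of qualifier keys are my own assumptions, not part of the original post. I included `geo_loc_name` because, as far as I know, newer GenBank records may use it in place of `country`; qualifiers missing from a record come back as empty strings.

```python
import csv

# Qualifier names are assumptions about what the records contain; "geo_loc_name"
# is listed because newer GenBank records may use it instead of "country".
KEYS = ("organism", "country", "geo_loc_name", "isolation_source")


def source_qualifiers(record, keys=KEYS):
    """Return {qualifier: first value or ""} from the record's first 'source' feature."""
    for feat in record.features:
        if feat.type == "source":
            # Qualifier values are lists of strings; take the first entry if present
            return {key: feat.qualifiers.get(key, [""])[0] for key in keys}
    # No 'source' feature found: return empty strings for every key
    return {key: "" for key in keys}


def write_source_table(gb_path, out_path, keys=KEYS):
    """Write one tab-separated row per GenBank record found in gb_path."""
    # Biopython is imported here so source_qualifiers stays usable on its own
    from Bio import SeqIO

    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(("name",) + tuple(keys))
        for record in SeqIO.parse(gb_path, "genbank"):
            row = source_qualifiers(record, keys)
            writer.writerow([record.name] + [row[key] for key in keys])
```

Calling `write_source_table("filename_of_gb.txt", "sources.tsv")` should then give one row per record, where `sources.tsv` is just a placeholder name for the output file.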