I downloaded a multi-FASTA file with 40 viral genomes from NCBI.
I read the file and tried to split it into one file per genome like this:
    from Bio import SeqIO

    for rec in SeqIO.parse(file, 'fasta'):
        # NCBI IDs can contain '|'; join the fields so the ID works as a file name
        ids = '_'.join(rec.id.split('|'))
        seqs = rec.seq
        # check for ids and seqs with len() and print(seqs[:500])
        outputfile = open('genome_' + ids + '.fasta', 'w')
        outputfile.write('>' + ids + '\n')
        outputfile.write(str(seqs))
        outputfile.close()
As noted in the code comment, I printed the lengths of the IDs and sequences, and everything looks fine. But when I checked the files in my directory, some of them (some of the big genomes) contain a sequence of length 0. The others are all right.
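For anyone who wants to reproduce the check on the output files, here is a minimal sketch of how it could be done with plain Python and no Biopython. The helper names `fasta_lengths` and `report_empty` are hypothetical, and the `genome_*.fasta` pattern assumes the file naming used in the loop above.

```python
import glob


def fasta_lengths(path):
    """Return {record_id: sequence_length} for every record in a FASTA file."""
    lengths = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith('>'):
                # Header line: take the first whitespace-separated token as the ID.
                fields = line[1:].split()
                current = fields[0] if fields else ''
                lengths[current] = 0
            elif current is not None:
                # Sequence line: add its length to the current record.
                lengths[current] += len(line)
    return lengths


def report_empty(pattern='genome_*.fasta'):
    """Return the files matching `pattern` that hold no sequence data."""
    empty = []
    for path in sorted(glob.glob(pattern)):
        lengths = fasta_lengths(path)
        # A file counts as empty if it has no records or any zero-length record.
        if not lengths or any(n == 0 for n in lengths.values()):
            empty.append(path)
    return empty
```

Calling `report_empty()` in the output directory should list exactly the files that ended up without sequence, which can then be compared against the record lengths printed inside the loop.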
Does anyone have an idea why this is happening?
Thanks for your time.
PS: As I emphasized above, many of the files are fine, so I assume the code is right. I am only asking why some files don't work. The code is simple and I don't get any error message, but some files end up empty.
I would just like to know whether someone here has run into something like this and what was done to fix it.