Hello, I am running a simple script with Biopython that slices a nucleotide sequence into 40-nt chunks:
x = range(100, 500, 100)  # start positions: 100, 200, 300, 400
sequence = 'AAAGGG...some sequence'
export_file = open('_SLQuery.fasta', 'w')
for i in x:
    name = 'query%d' % i
    query = sequence[i:i+40]  # 40-nt chunk starting at position i
    export_file.write('>' + str(name) + '\n' + str(query) + '\n\n')
export_file.close()
Currently I have to manually enter each sequence from a multi-FASTA file (500+ records) one by one and specify a new output file name each time. Any ideas how I could get this script to read the whole FASTA file at once and export the queries for each record to a different output file (23_SLQuery.fasta, 74_SLQuery.fasta, etc., where 23 and 74 are the record IDs)? I have tried SeqIO.parse, but it still only processes one sequence, and I could not figure it out using SeqIO.index. Any help would be appreciated, thanks.
Thanks, but I am not trying to split the original FASTA file. I want to read a multi-FASTA file (contig 1, 2, ... 501, etc.) and, for each sequence in the file, run the script to slice the sequence into 40-nt chunks from 100 upstream to 500 upstream, then output those query sequences into a separate FASTA file for each sequence in the original file. I apologize if I am missing something in your answer.
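For what it's worth, here is a minimal sketch of the workflow described above, assuming Biopython is installed and that the record IDs are safe to use in file names. The helper names (slice_queries, write_queries, process_fasta) and the input file name contigs.fasta are made up for illustration. SeqIO.parse yields every record in the file one at a time, so the slicing code has to sit inside the loop over records; if it sits after the loop, only the last record gets processed.

```python
import os


def slice_queries(sequence, starts=range(100, 500, 100), length=40):
    """Return (name, chunk) pairs, one per start position, as in the original loop."""
    return [("query%d" % i, str(sequence[i:i + length])) for i in starts]


def write_queries(record_id, sequence, out_dir="."):
    """Write the chunks of one sequence to <record_id>_SLQuery.fasta (hypothetical helper)."""
    out_path = os.path.join(out_dir, "%s_SLQuery.fasta" % record_id)
    with open(out_path, "w") as export_file:
        for name, query in slice_queries(sequence):
            # Same layout as the original script: header, chunk, blank line
            export_file.write(">%s\n%s\n\n" % (name, query))
    return out_path


def process_fasta(fasta_path, out_dir="."):
    """Run the slicing on every record of a multi-FASTA file (needs Biopython)."""
    from Bio import SeqIO  # imported here so the helpers above work without Biopython
    return [write_queries(record.id, record.seq, out_dir)
            for record in SeqIO.parse(fasta_path, "fasta")]


# Example call; "contigs.fasta" is a placeholder file name:
# process_fasta("contigs.fasta")
```

Two things to watch: records shorter than the start positions will produce empty chunks, and IDs containing characters such as "|" or "/" would need cleaning before being used as file names.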
I see. Should it look like this post-solution?
Python: Slicing Sequences In Fasta File