I have written code to split a FASTA file into a desired number of output files, each containing a desired number of sequences. However, it does not write any output to the second (that is, the last) output file, and I couldn't figure out why.
The code is as follows:
file_open=open("sequence.txt","r") # Input file
count_file=2 # Number of output files
count_sequence=5 # Sequences in each output file
sequence=0 # Count total sequences until count_sequences
for i in range(count_file,0,-1): # Loop for file names generation
    sequence_file=open("sequence"+str(i)+".fasta","w") # Output file
    for line in file_open: # Reading the sequences one by one
        if line.startswith(">"):
            fasta=""
            for data_line in file_open:
                if data_line.startswith("\n"):
                    break
                else:
                    fasta=fasta+data_line
            print(sequence_file)
            print(line+fasta)
            sequence_file.write(line+fasta+"\n") # Writing the FASTA sequences to file
            sequence=sequence+1 # Increment the count of the sequence
            if sequence==count_sequence: # If number of sequences are equal to the desired number of sequences
                sequence=0 # reset the counter
                break
In this case, sequence1.fasta is empty, but sequence2.fasta has the first five sequences.
I know that many tools mentioned in the forum do this, but I want a specific format and output extension. That's why I wrote this code, but I couldn't get it to work.
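For reference, here is a minimal sketch of the intended behaviour, not a diagnosis of the code above. It assumes that every record starts with a ">" header line, that blank lines between records can be dropped, and that the output files should be numbered from 1 upward so that sequence1.fasta receives the first records. The names read_records, split_fasta, and the prefix parameter are hypothetical and chosen only for illustration.

```python
from itertools import islice


def read_records(handle):
    """Yield each FASTA record (header plus sequence lines) as one string."""
    record = []
    for line in handle:
        if line.startswith(">"):
            if record:
                yield "".join(record)
            # Start a new record; make sure the line ends with a newline.
            record = [line.rstrip("\n") + "\n"]
        elif record and line.strip():
            # Keep non-blank sequence lines; blank separator lines are skipped.
            record.append(line.rstrip("\n") + "\n")
    if record:
        yield "".join(record)


def split_fasta(in_path, count_file=2, count_sequence=5, prefix="sequence"):
    """Write up to count_sequence records into each of count_file files.

    Returns a list of (output path, number of records written) pairs.
    """
    written = []
    with open(in_path) as handle:
        records = read_records(handle)
        for i in range(1, count_file + 1):
            # Take the next batch of records from the shared generator.
            chunk = list(islice(records, count_sequence))
            out_path = f"{prefix}{i}.fasta"
            # The with-block closes (and therefore flushes) each output file.
            with open(out_path, "w") as out:
                out.writelines(chunk)
            written.append((out_path, len(chunk)))
    return written
```

Calling split_fasta("sequence.txt") would then create sequence1.fasta and sequence2.fasta in the current directory. If the input holds fewer records than count_file * count_sequence, the later files are still created but contain fewer records or none. Each output file is opened in a with-block so that buffered writes are flushed when the block ends. The original code never closes its files explicitly, which may or may not be related to the empty output.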