Hi All
I have subsets of 100 and 500 reads in FASTA and FASTQ formats. How can I split this one FASTA/FASTQ file with 100 reads into 100 FASTA files containing one sequence read each?
Thank you all!
faSplit (Linux version linked; a macOS build is also available) from the Kent Utilities will take care of splitting the FASTA file.
Instead of "sorting" you may want to change the title to "splitting".
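For FASTQ files you could do:
split -l 4 -d -a 3 your_file.fq SEQ
This assumes every record occupies exactly four lines. The -a option sets the length of the numeric suffix, so three digits are enough for up to 1000 output files. Replace SEQ with whatever word you want to use as the file name PREFIX.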
It's very likely that what you are looking for already exists, but rolling your own code (for example in Python) would be trivial. I guess it would take me longer to search the internet for something than to just write it myself. Let me know if you need help with that (but for your own good it's best if you try on your own first to get something working...)
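For what it's worth, a roll-your-own FASTA splitter could look roughly like the sketch below. The function name split_fasta, the "read" prefix and the zero-padded file naming are placeholders of mine, not anything from a library, and it assumes an uncompressed FASTA where every record starts with a ">" header line.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir, prefix="read"):
    """Write each FASTA record to its own file and return the paths written.

    Hypothetical helper: output files are named <prefix>_0001.fa,
    <prefix>_0002.fa, ... in the order the records appear.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                # A header starts a new record: close the previous file
                # and open the next one.
                if out is not None:
                    out.close()
                path = out_dir / f"{prefix}_{len(written) + 1:04d}.fa"
                out = open(path, "w")
                written.append(path)
            # Lines before the first header (if any) are skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return written
```

Calling split_fasta("your_file.fa", "split_out") on a 100-read file should leave 100 small files in split_out, with multi-line sequences kept intact.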
Although similar, FASTA and FASTQ are different file formats. FASTQ contains base quality information in addition to the sequence information. If you're splitting a FASTQ into many FASTA files, you will be discarding the sequence quality information. Is this really what you want to do?
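If dropping the qualities really is what you want, here is a minimal sketch of going from one FASTQ to one FASTA file per read. It assumes the common four-line FASTQ layout (header, sequence, "+", qualities) with no wrapped sequences and no trailing blank lines; fastq_to_single_fastas and the file naming are made up for illustration.

```python
from itertools import islice
from pathlib import Path


def fastq_to_single_fastas(fastq_path, out_dir, prefix="read"):
    """Write one FASTA file per FASTQ record, discarding the quality line.

    Hypothetical helper that assumes strict four-line FASTQ records.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(fastq_path) as handle:
        while True:
            # Read the next four lines as one record.
            record = list(islice(handle, 4))
            if not record:
                break
            if len(record) < 4 or not record[0].startswith("@"):
                raise ValueError("input does not look like four-line FASTQ")
            header = record[0][1:].rstrip("\n")
            sequence = record[1].rstrip("\n")
            path = out_dir / f"{prefix}_{len(written) + 1:04d}.fa"
            # Only the header and sequence are kept; lines 3 and 4 are dropped.
            path.write_text(f">{header}\n{sequence}\n")
            written.append(path)
    return written
```

The quality strings are gone for good in the output, so keep the original FASTQ around if there is any chance you will need them later.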