Split a multi-FASTA file into individual sequence files
6.6 years ago
tcf.hcdg ▴ 70

Hello, I would like to split a multi-FASTA file into one file per sequence. I used the following code, and it worked fine on files with up to 500 sequences. When I tried the same code on a multi-FASTA file with 1500 sequences, it failed with the error message below. The code I tried:

    awk -F '>' '/^>/ {F=sprintf("%s.fasta", $2); print > F; next;} {print >> F;}' < dt123_nbxcs.fa

The error I received:

    awk: cannot open "gi|353013051|gb|JH237239.1|:7759-7979.fasta" for output (Too many open files)

Is there a way to do this other than with awk?

What if you add close(F) after print >> F; in the final block, i.e.

    awk -F '>' '/^>/ {F=sprintf("%s.fasta", $2); print > F; next;} {print >> F; close(F)}' < dt123_nbxcs.fa


Thanks, it worked. As I understand it, this closes each file after the sequence has been written to it, instead of keeping it in memory. Is that right, or does it mean something else?
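On the close() question: awk keeps every output file it has written to open until close() is called or the program ends, so the limit being hit is the number of open files rather than memory. Here is a minimal sketch of a variant that closes the previous file each time a new header is reached, so only one output file is open at a time. It assumes every record ID is unique and usable as a file name; demo.fa and the variable name out are made up for illustration.

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical three-record input; replace with your own multi-FASTA file.
printf '>seq1\nACGT\nTTGA\n>seq2\nGGCC\n>seq3\nAATT\n' > demo.fa

# On each header line: close the previous output file (if any) and build
# the new file name from the text after '>'.
# Every line, header included, is then written to the current file.
awk '/^>/ { if (out) close(out); out = substr($0, 2) ".fasta" }
     { print > out }' demo.fa

ls
```

Because each file is closed only once its record is finished, plain > is enough here and no append is needed. If the same ID occurs twice, the second record would overwrite the first, and headers containing a / would need to be cleaned up before being used as file names.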

6.6 years ago


seqkit split --by-id multi_fasta_file.fasta