Question: Split a multi-FASTA file into individual sequence files
tcf.hcdg (European Union) wrote, 3.3 years ago:

Hello, I would like to split a multi-FASTA file into individual files, one per sequence. The following code worked fine on files with up to 500 sequences, but when I tried it on a multi-FASTA file with 1500 sequences it failed with the error message below. The code I tried:

 awk -F '>' '/^>/ {F=sprintf("%s.fasta", $2); print > F;next;} {print >> F;}' < dt123_nbxcs.fa

The error I received:

awk: cannot open "gi|353013051|gb|JH237239.1|:7759-7979.fasta" for output (Too many open files)

Is there a way to do this other than with awk?


What if you add close(F) after the print statement in the final block, i.e.

awk -F '>' '/^>/ {F=sprintf("%s.fasta", $2); print > F;next;} {print >> F; close(F)}' < dt123_nbxcs.fa
written 3.3 years ago by 5heikki

Thanks, it worked. As I understand it, this closes each file after writing the sequence to it, instead of keeping it in memory. Is that right, or does it mean something else?

written 3.3 years ago by tcf.hcdg

Pretty much
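
To spell that out a bit: awk keeps every file it has written to open until close() is called on it, so with enough sequences the process hits its limit on open files. Below is a minimal sketch of a variant that closes the previous output file once per record, when the next header line is reached, rather than after every line. The input file demo.fa and its contents are made up for illustration, and the sketch assumes each header is unique and usable as a file name.

```shell
# Hypothetical demo input: a tiny multi-FASTA file (illustration only).
cat > demo.fa <<'EOF'
>seq1
ACGT
TTGA
>seq2
GGCC
EOF

# On each header line: close the previous output file (if any), then
# start a new file named after the header text following '>'.
# All other lines go to the file that is currently open.
awk -F '>' '
    /^>/    { if (F != "") close(F); F = $2 ".fasta"; print > F; next }
    F != "" { print > F }
' demo.fa
```

With this version only one output file is open at any time. Because a closed file is truncated if it is reopened with >, two records with an identical header would overwrite each other.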

written 3.3 years ago by 5heikki
Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote, 3.3 years ago:

This question has already been asked and answered here.
