Question: Split multifasta file using awk command
fec2 wrote (16 months ago):

Hi,

I have a FASTA file and need to split it into multiple FASTA files, one gene per file. Following the post "Splitting A Fasta File", I tried the command below:

awk -F "|" '/^>/ {close(F) ; F = $1".fasta"} {print >> F}' yourfile.fa

However, every output file name contains the symbol ">", for example ">my_contig_name.fasta".

How can I avoid having ">" in the output file names? Thanks.

Please use the search function; this has been asked many times before:

Split multifasta file in individual sequence file

How to split a multi fasta file into individual chromosomes

splitting multifasta-file in python

Split the multiple sequences file into a separate files

written 16 months ago by ATpoint

Hi,

Actually, I have tried several commands from these posts, but only the command above works for me. However, it puts ">" in the output file names.

written 16 months ago by fec2
AK wrote (16 months ago):

Try changing the command to:

awk -F "|" '/^>/ {close(F); ID=$1; gsub("^>", "", ID); F=ID".fasta"} {print >> F}' yourfile.fa

If you are not limited to awk, you can also use: seqkit split --by-id yourfile.fa
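
To see why the original command put ">" in the names and what the gsub fix changes, here is a small sketch that runs the awk command above on a made-up two-record file. The file name demo.fa, the scratch directory, and the headers are invented for illustration; they are not from the original post.

```shell
# Work in a scratch directory so the demo files do not clutter anything.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical two-record multi-FASTA (headers invented for this demo).
printf '>geneA|desc one\nACGT\nTTGA\n>geneB|desc two\nGGCC\n' > demo.fa

# With -F "|", $1 of a header line is everything before the first "|",
# so it still starts with ">". Copy it to ID, strip the leading ">",
# and only then build the output file name.
awk -F "|" '/^>/ {close(F); ID=$1; gsub("^>", "", ID); F=ID".fasta"} {print >> F}' demo.fa

# List the result: demo.fa plus one .fasta file per record.
ls
```

The listing should show demo.fa, geneA.fasta and geneB.fasta, with each header line kept intact inside its file. Because the command appends with ">>", rerunning it in the same directory adds the records a second time to the existing files, so remove old output files first.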

Thank you very much!

written 16 months ago by fec2
Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote (16 months ago):

Try:

awk -F "|" '/^>/ {close(F) ; F = substr($1,2,length($1)-1)".fasta"} {print >> F}' yourfile.fa
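
As a quick check of what the substr call does, the sketch below applies it to a single made-up header line (the header text is an assumption for illustration). It also shows that omitting the length argument gives the same result, since substr with two arguments returns everything from the start position to the end of the string.

```shell
# substr(s, 2, length(s)-1) takes s from its second character to the end,
# which drops the leading ">"; substr(s, 2) covers the same span.
out=$(printf '>my_contig_name|extra\n' |
  awk -F "|" '{ print substr($1, 2, length($1)-1); print substr($1, 2) }')

# Both lines printed here should be the header ID without ">".
printf '%s\n' "$out"
```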

Thank you very much!

written 16 months ago by fec2