Change header of a Fasta file according to the file name
5.8 years ago
Eva_Maria ▴ 180

Hi,

I have a FASTA file named GCA_001609185.1_ASM160918v1_genomic.fsa, and I want to change its header to the file name without the extension, like this: >GCA_001609185.1_ASM160918v1_genomic
I am looking for a solution.

Thank you

fasta awk • 8.4k views

Is there only one record in the file?

This is kind of off topic, so I won't be overly surprised if someone closes it. Having said that, note that awk has a FILENAME variable that you could use for this purpose.
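
To illustrate the hint, here is a minimal sketch of how FILENAME behaves. The demo file names (a.fa, b.fa) and their contents are made up for illustration.

```shell
# Hypothetical demo files; names and sequences are invented.
tmp=$(mktemp -d)
printf '>seq1\nACGT\n' > "$tmp/a.fa"
printf '>seq2\nTTGA\n' > "$tmp/b.fa"

# FNR restarts at 1 for every input file, so this prints each file name once.
# FILENAME holds the name exactly as it was given on the command line.
(cd "$tmp" && awk 'FNR == 1 {print FILENAME}' a.fa b.fa) > "$tmp/names.txt"
cat "$tmp/names.txt"
```

Note that FILENAME includes any directory part that was passed to awk, so a path such as data/a.fa would show up as data/a.fa.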

5.8 years ago
 awk '/^>/ {sub(/\.(fa|fasta|fsa)$/, "", FILENAME); printf(">%s\n", FILENAME); next} {print}' input.fa
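
For anyone who wants to try the idea on a throwaway file first, here is a small sketch. It assumes you want to strip whatever the last extension is (so .fsa works too) and to ignore any directory in the path; the two demo records and their sequences are invented.

```shell
# Throwaway input named after the file in the question; sequences are invented.
tmp=$(mktemp -d)
input="$tmp/GCA_001609185.1_ASM160918v1_genomic.fsa"
printf '>contig_1 some description\nACGTACGT\n>contig_2\nGGCC\n' > "$input"

awk 'FNR == 1 {
         name = FILENAME
         sub(/.*\//, "", name)      # drop any leading directory
         sub(/\.[^.]*$/, "", name)  # drop the last extension only
     }
     /^>/ {print ">" name; next}    # replace every header line
     {print}' "$input" > "$tmp/renamed.fsa"

cat "$tmp/renamed.fsa"
```

If the file holds more than one record, every header ends up identical, which many downstream tools will not accept. That is why the number of records in the file matters.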

Is there a way to do this for multiple (i.e. several hundred) files at once and output as individual files (with the original names) to a subdirectory? Thanks!

Dear Pierre Lindenbaum, I'm a molecular biologist and I have the same question as theclubstyle: "Is there a way to do this for multiple (i.e. several hundred) files at once and output as individual files (with the original names) to a subdirectory?"

I have read the information on the page you shared, but I am not able to solve the problem with my limited knowledge of bioinformatics.

I'd appreciate your help.

Assuming all your FASTA files are in the current directory and you want to write the results to outdir (create it first with mkdir -p outdir):

Using a bash loop:

for FILE in *.fa
do
 awk '/^>/ {sub(/\.fa(sta)?$/, "", FILENAME); printf(">%s\n", FILENAME); next} {print}' "$FILE" > "outdir/changed_${FILE}"
done

Using GNU parallel ({.} is the input file name without its extension):

parallel 'awk -v name={.} "/^>/ {print \">\" name; next} {print}" {} > outdir/changed_{}' ::: *.fa
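
If you would rather keep the original file names in the subdirectory and cover several extensions at once, a variation along these lines might help. This is only a sketch: the demo files, the directory name renamed and the list of extensions are assumptions, so adjust them to your data.

```shell
# Hypothetical demo data: three small FASTA files in a scratch directory.
tmp=$(mktemp -d)
printf '>x\nAAAA\n' > "$tmp/sampleA.fa"
printf '>y\nCCCC\n>z\nGGGG\n' > "$tmp/sampleB.fasta"
printf '>w\nTTTT\n' > "$tmp/GCA_001609185.1_ASM160918v1_genomic.fsa"

outdir="$tmp/renamed"
mkdir -p "$outdir"

for f in "$tmp"/*.fa "$tmp"/*.fasta "$tmp"/*.fsa
do
    [ -e "$f" ] || continue            # skip patterns that matched nothing
    base=${f##*/}                      # file name without the directory
    awk -v name="${base%.*}" '
        /^>/ {print ">" name; next}    # replace every header line
        {print}                        # copy sequence lines unchanged
    ' "$f" > "$outdir/$base"
done

# Show the new headers, prefixed with the file they came from.
grep '^>' "$outdir"/*
```

Passing the name with -v keeps the awk program free of shell quoting problems; the output files never overwrite the inputs because they are written to a different directory.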

It works! :) Deeply grateful for your help.

Regards

Hi @WouterDeCoster, this command is helpful, thank you. Could you please help me write a script that keeps the original header and adds the file name in front of it?

For example, my files are named HP1 and HP2.

HP1 is this:

>NODE_1_length_179136_cov_279.866497
>NODE_2_length_175370_cov_263.866948

HP2 is this:

>NODE_1_length_134626_cov_266.846339
>NODE_2_length_107967_cov_280.028186

I want them to become:

HP1 becomes:

>HP1 NODE_1_length_179136_cov_279.866497

>HP1 NODE_2_length_175370_cov_263.866948

HP2 becomes:

>HP2 NODE_1_length_134626_cov_266.846339

>HP2 NODE_2_length_107967_cov_280.028186

Or, as a second option, if possible:

HP1 becomes:

>HP1 1

>HP1 2

HP2 becomes:

>HP2 1

>HP2 2

Thank you so much

Please post this as a new question with the details and a link to this post, @Ricky.

You mean you want to add the filename to the fasta identifier?
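
If so, a rough sketch of both variants from the example could look like this. The headers are taken from the HP1 example above; the sequences, the scratch directory and the output names (HP1.prefixed, HP1.numbered) are made up for illustration.

```shell
# Demo copy of HP1: headers from the example, sequences invented.
tmp=$(mktemp -d)
printf '>NODE_1_length_179136_cov_279.866497\nACGT\n>NODE_2_length_175370_cov_263.866948\nTTGA\n' > "$tmp/HP1"

# Variant 1: keep the original header and put the file name in front of it.
awk 'FNR == 1 {name = FILENAME; sub(/.*\//, "", name)}
     /^>/     {print ">" name " " substr($0, 2); next}
     {print}' "$tmp/HP1" > "$tmp/HP1.prefixed"

# Variant 2: replace the header with the file name and a running record number.
awk 'FNR == 1 {name = FILENAME; sub(/.*\//, "", name); n = 0}
     /^>/     {n++; print ">" name " " n; next}
     {print}' "$tmp/HP1" > "$tmp/HP1.numbered"

grep '^>' "$tmp/HP1.prefixed" "$tmp/HP1.numbered"
```

One caveat: many tools treat only the text before the first space as the sequence ID, so with a space every record in a file would share the ID HP1. Joining with an underscore instead of a space (for example ">" name "_" n) keeps the IDs unique.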
