Question: How To Add Specific Word To Fasta Header
1
Palu (250) wrote 7.7 years ago:

I have more than 5000 FASTA sequences in a file and want to add a word, for instance "phosphate", to the header of every sequence. Please suggest a Perl solution for that.

fasta • 7.6k views
modified 3.9 years ago by arnstrm (1.7k) • written 7.7 years ago by Palu (250)
6
brentp (23k), Salt Lake City, UT, wrote 7.7 years ago:

In place:

    perl -pi -e 's/^>/>phosphate-/g' your.fasta

Or to a new file:

    perl -p -e 's/^>/>phosphate-/g' your.fasta > phosphate.fasta

To add it to the end of the header instead, use this regex:

    's/^(>.*)$/$1-phosphate/g'
modified 7.7 years ago • written 7.7 years ago by brentp (23k)

Thanks brentp, but I want to add it at the end of my header. Is there any trick?

written 7.7 years ago by Palu (250)

@palu I edited my answer, see the last line.

written 7.7 years ago by brentp (23k)

Thank you very much, sir. For any layman like me, the final code will look like this:

perl -p -e "s/^(>.*)$/$1-phosphate/g" your.fasta > phosphate.fasta

written 7.7 years ago by Palu (250)

Except you should use single quotes, so the shell does not expand $1 before Perl sees it: perl -p -e 's/^(>.*)$/$1-phosphate/g' in.fasta > out.fasta
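
To see why the quotes matter, you can print the expression exactly as the shell would hand it to perl. A small sketch (the variable names are made up for the demo; set -- just empties the shell's own $1, as in a fresh interactive shell):

```shell
# Clear the shell's positional parameters so its own $1 is empty.
set --

# Double quotes: the shell substitutes its (empty) $1 before perl ever runs.
double_quoted="s/^(>.*)$/$1-phosphate/g"
printf '%s\n' "$double_quoted"   # prints: s/^(>.*)$/-phosphate/g

# Single quotes: the expression reaches perl untouched, so $1 is the regex capture.
single_quoted='s/^(>.*)$/$1-phosphate/g'
printf '%s\n' "$single_quoted"   # prints: s/^(>.*)$/$1-phosphate/g
```

With double quotes the $1 is gone by the time Perl runs, so each header would be replaced by -phosphate rather than extended.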

written 7.7 years ago by brentp (23k)

palu, if you like Brent's answer the best, you should select it as such (hover over the votes to do that).

written 7.7 years ago by Neilfws (48k)

@newlife thanks, I'll do that.

written 7.7 years ago by Palu (250)
4
Daniel (3.7k), Cardiff University, wrote 7.7 years ago:

An easy way with sed:

sed 's/>.*/&_phosphate/' foo.in >bar.out
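
A slightly more defensive variant of the same idea, as a sketch: anchoring the pattern with ^ restricts it to lines that start with >, and the same approach works for a prefix. The file names and records below are invented for illustration.

```shell
# Scratch directory with a toy FASTA file (contents are made up).
tmp=$(mktemp -d)
printf '>seq1 some description\nACGT\n>seq2\nGGCC\n' > "$tmp/demo.fasta"

# Suffix: & stands for the whole matched header line.
sed 's/^>.*/&_phosphate/' "$tmp/demo.fasta" > "$tmp/suffix.fasta"

# Prefix: replace the leading ">" with ">phosphate_".
sed 's/^>/>phosphate_/' "$tmp/demo.fasta" > "$tmp/prefix.fasta"

# Show both results for a quick visual check.
cat "$tmp/suffix.fasta" "$tmp/prefix.fasta"
```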
written 7.7 years ago by Daniel (3.7k)
2
Brian Bushnell (16k), Walnut Creek, USA, wrote 3.9 years ago:

A faster option, from the BBMap package:

bbrename.sh in=file.fasta out=renamed.fasta prefix=phosphate addprefix=t

written 3.9 years ago by Brian Bushnell (16k)
0
arnstrm (1.7k), Ames, IA, wrote 3.9 years ago:

I know there are lots of options and this can easily be done with many Unix one-liners, but here is another alternative (my favorite):

bioawk -c fastx '{ print ">PREFIX"$name; $seq }' input.fasta
bioawk -c fastx '{ print ">"$name"|SUFFIX"; $seq }' input.fasta
modified 3.9 years ago • written 3.9 years ago by arnstrm (1.7k)

Hi, I'm not sure I understand the (bio)awk syntax, but your command did not work for me (it did not print the sequences). I put a newline in place of the semicolon:

 bioawk -c fastx '{ print ">PREFIX" $name "\n" $seq }' input.txt >output.txt

which seems to work. Anyway thanks for pointing me towards the solution.
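
The semicolon version prints only the header because, in awk, the semicolon ends the print statement and the bare $seq that follows is evaluated and thrown away. If bioawk is not installed, plain awk can do a similar job. A rough sketch, with PREFIX and SUFFIX as placeholders and a made-up input file:

```shell
# Scratch directory with a toy FASTA file (contents are made up).
tmp=$(mktemp -d)
printf '>seq1\nACGT\nTTGA\n>seq2\nGGCC\n' > "$tmp/input.fasta"

# Header lines: insert PREFIX right after ">"; every other line passes through as is.
awk '/^>/ { print ">PREFIX" substr($0, 2); next } { print }' "$tmp/input.fasta" > "$tmp/prefixed.fasta"

# Header lines: append |SUFFIX; sequence lines keep their original wrapping.
awk '/^>/ { print $0 "|SUFFIX"; next } { print }' "$tmp/input.fasta" > "$tmp/suffixed.fasta"

# Show both results for a quick visual check.
cat "$tmp/prefixed.fasta" "$tmp/suffixed.fasta"
```

Unlike the bioawk version, which as far as I know keeps only the name part of the header and prints each sequence on one line, this leaves the full header text and the original line wrapping alone.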

modified 2.6 years ago • written 2.6 years ago by al-ash (100)