sequence renaming
2.2 years ago
zhichusun ▴ 10

I have a FASTA file. How can I rename the sequences according to the name of the file? For example, the sequence names in gene.fasta are >1 and >2, and I want them renamed to >gene1 and >gene2. Thank you very much.

renaming • 795 views
$ sed -r  '/^>/ s/^./>gene_/' test.fa
2.2 years ago
M__ ▴ 200

On Linux (any flavour) or Unix (macOS):

perl -pi -e 's/^(>[0-9]+).*/$1gene/' myfile.fa

myfile.fa

>1
AGTC
>2
AGTC
>3
AGTC

output

>1gene
AGTC
>2gene
AGTC
>3gene
AGTC

It's called a Perl pie (a one-liner, named after the -p, -i and -e flags) and it is extremely quick at handling massive files. Happy to explain the regex if needed. It changes the file in place, so there is no need to pipe the output or make a copy; the -i flag takes care of that.
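
If the prefix should come from the file name itself, as the question asks, something along these lines might work. This is only a minimal sketch: the function name rename_by_filename is made up for illustration, it assumes the prefix is the file name without its extension, and it writes to standard output instead of editing in place, so the original file is left untouched.

```shell
# Hypothetical helper (not from any package): prefix every FASTA header
# with the base name of the file, e.g. gene.fasta turns >1 into >gene1.
rename_by_filename() {
    fasta="$1"
    prefix=$(basename "$fasta")   # drop the directory part
    prefix="${prefix%.*}"         # drop the last extension (.fasta, .fa, ...)
    awk -v p="$prefix" '
        /^>/ { print ">" p substr($0, 2); next }   # header: insert prefix after ">"
        { print }                                  # sequence line: unchanged
    ' "$fasta"
}

# Usage (redirect to a new file, then inspect it before replacing anything):
#   rename_by_filename gene.fasta > renamed.fasta
```

Using awk with -v keeps the prefix out of the pattern, so a file name containing characters such as / or & should not break the substitution.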

2.2 years ago
Mensur Dlakic ★ 27k

This can be done with seqtk:

seqtk rename gene.fasta gene > renamed.fasta
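
As far as I understand it, seqtk rename does not keep the old names but replaces each header with the prefix followed by a running number. If seqtk is not installed, a rough awk equivalent could look like the sketch below. The function name number_headers is hypothetical, and this only imitates the numbering for a plain single-end FASTA file; it is not seqtk's own code.

```shell
# Hypothetical stand-in for the numbering behaviour described above:
# every header becomes ">" + prefix + counter, in order of appearance.
number_headers() {
    prefix="$1"
    fasta="$2"
    awk -v p="$prefix" '
        /^>/ { n++; print ">" p n; next }   # header: emit prefix plus counter
        { print }                           # sequence line: unchanged
    ' "$fasta"
}

# Usage:
#   number_headers gene gene.fasta > renamed.fasta
```

Note that any description text after the original name is discarded here, so keep the input file if that information matters.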

Also, a simple replace command with perl:

perl -p -e 's/^>/>gene/' gene.fasta > renamed.fasta