How to rename sequence headers
16 months ago
Gabriel • 0


>FileBacMet2.seq7__cadmium-translocating_Ptype_ATPase_[Mycobacterium_wolinskyi]_[ctpD]
TGACCACCCTGGCGAACGCGCCGGCGCCGAGCCGCAGCGCGGCGAGCGCGAGCAGCAGCGCGGGCTGGCTGTGGAGCGTGGCGAGCGTGCGCAGCGCGGCGGGCGCGCTGGGCCTGTTTCTGGCGGGCCTGGCGGCGCAGCTGGCGGGCGCGCCGGAACCGCTGTGGTGGGGC
fasta • 598 views

I've restored your post from spam - I don't know why it was flagged in the first place, but from the severe lack of context, I can understand why the bot couldn't exclude it from spam. Please give us as much detail as you can.

16 months ago
seidel 11k

You can rename headers in many ways. One is a Perl one-liner: the command-line switches documented in perlrun let you evaluate Perl statements on data from a file or from stdin, on the fly. You didn't mention how you want to rename your headers, but assuming you want to simplify them to sequenceID_organism, you could do something like:

perl -pe 's/>(.+?)__.+?\[(.+?)\].*/>$1_$2/' seq.fa

>FileBacMet2.seq7_Mycobacterium_wolinskyi
TGACCACCCTGGCGAACGCGCCGGCGCCGAGCCGCAGCGCGGCGAGCGCG...

This calls Perl on your file. -p means Perl loops over the input, runs the expression on each line, and prints the (possibly modified) line; -e means evaluate the following expression. The expression has the form s/match/replace/, where the little s in front means substitute whatever matches the first part with the second part. Lines that don't match, such as the sequence lines, are printed unchanged.

The match part is a regular expression. You can look up the syntax, but the one above translates roughly as follows: find a ">", then the shortest run of characters up to a double underscore (this part looks like your sequence ID), then the shortest run of characters up to an opening square bracket, then whatever is inside the first pair of square brackets, then anything that comes after that until the end of the line. If the pattern matches, whatever was found in the first set of parentheses is stored in a variable called $1, and whatever was found in the second set of parentheses in $2. The replacement is then built from those variables: ">", $1, an underscore, $2.
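If it helps to check the pattern step by step, here is a minimal sketch of the same substitution in Python. It is only an illustration, not part of the one-liner above: the function name `rename_header` is made up, and the `^` anchor is an addition that restricts the match to the start of a line.

```python
import re

# Same pattern as the Perl one-liner; the leading ^ is an added assumption
# so that only lines beginning with ">" can match.
HEADER = re.compile(r"^>(.+?)__.+?\[(.+?)\].*")


def rename_header(line: str) -> str:
    """Simplify a matching FASTA header; return any other line unchanged."""
    # \1 is the sequence ID (text before "__"),
    # \2 is the text inside the first pair of square brackets.
    return HEADER.sub(r">\1_\2", line)


header = (
    ">FileBacMet2.seq7__cadmium-translocating_Ptype_ATPase_"
    "[Mycobacterium_wolinskyi]_[ctpD]"
)
print(rename_header(header))          # >FileBacMet2.seq7_Mycobacterium_wolinskyi
print(rename_header("TGACCACCCTGG"))  # sequence lines pass through untouched
```

To process a whole file you would call `rename_header` on each line as you read it and write the result to a new file.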

If you search on this site for "rename headers", "fasta headers", etc. you'll find lots of examples.
