How to replace a list of headers in a FASTA file when the new names are not in the same order
2.6 years ago
mthm ▴ 50

This is what my FASTA file looks like:

>monCan3F9-B-G1795-Map9
TTTATTATACCCTGAACCCATTAAAA(multiple lines)
>monJX13F48-L-B718-Map1
AAAATTAATTCAGAATTATGTTTG(multiple lines)
.
.
.

The list of new names is not in the same order as the sequences in the FASTA file, so I have to define each replacement explicitly, e.g.:

monCan3F9-B-G1795-Map9 > BARI1#DNA/Tc1-Mariner
monJX13F48-L-B718-Map1 > PARIS#LTR
.
.
For a few names I could do that manually using 'sed', but I don't know how to do it when I have about 1000 of them. I checked the samtools manual, but as far as I understand it requires SAM and BAM files. Is there any other toolkit that can do this?

fasta reheader • 1.5k views

for a few names, I could do that manually using 'sed' but I don't know how to do it when I have about 1000 of them!

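Use sed with the -f option, which reads its editing commands from a script file, so you can put one substitution per header in that file.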
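A minimal sketch of the sed -f approach, under some assumptions that are not in the original post: the mapping is stored as a tab-separated file (old name, then new name), each header line consists of the name only, and the names contain no `|`, `&`, or regex metacharacters. All file names are placeholders, and the two small input files are created only so the example is self-contained.

```shell
#!/bin/sh
# Sketch: build a sed script from a tab-separated mapping, then apply it with -f.
# test.fa, rename.tsv, rename.sed and output.fa are placeholder names.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Demo inputs; note the mapping is deliberately in a different order than the FASTA.
printf '>monCan3F9-B-G1795-Map9\nTTTATTATACCC\n>monJX13F48-L-B718-Map1\nAAAATTAATTCA\n' > test.fa
printf 'monJX13F48-L-B718-Map1\tPARIS#LTR\nmonCan3F9-B-G1795-Map9\tBARI1#DNA/Tc1-Mariner\n' > rename.tsv

# One substitution per mapping line, of the form: s|^>OLD$|>NEW|
# "|" is used as the delimiter because the new names contain "/".
awk -F '\t' '{ printf "s|^>%s$|>%s|\n", $1, $2 }' rename.tsv > rename.sed

# Apply every substitution in a single pass over the FASTA file.
sed -f rename.sed test.fa > output.fa
cat output.fa
```

Anchoring each pattern with `^>` and `$` keeps a name such as Map1 from also matching Map10, and keeps sequence lines untouched. If the headers carry a description after the name, the `$` anchor would need to be relaxed.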

2.6 years ago

Use seqkit replace with a key-value file (--kv-file). It takes two input files: the FASTA file and a tab-separated file. The tab-separated file should have two columns: the first holds the existing name/ID (the FASTA header, without the leading >) and the second holds the new header.
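As an illustration of the tab-separated file described here, the "OLD > NEW" list from the question could be converted with awk. This is only a sketch: names.txt and rename.txt are placeholder file names, and it assumes each pair sits on one line with " > " as the only separator.

```shell
#!/bin/sh
# Sketch: turn an "OLD > NEW" list into a two-column, tab-separated file.
# names.txt and rename.txt are placeholder names.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Demo input in the same shape as the list in the question (blank lines included).
printf 'monCan3F9-B-G1795-Map9 > BARI1#DNA/Tc1-Mariner\n\nmonJX13F48-L-B718-Map1 > PARIS#LTR\n' > names.txt

# Split on " > ", skip blank or malformed lines, write OLD<TAB>NEW.
awk -F ' > ' 'NF == 2 { printf "%s\t%s\n", $1, $2 }' names.txt > rename.txt
cat rename.txt
```

The order of the lines in rename.txt does not need to match the order of the sequences, since each header is looked up by its name.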


Thanks for your suggestion. I tried this command:

./seqkit replace -p "(>\S+)" --replacement "{kv}" --kv-file rename.txt test.fa --keep-key > output.fa

returns:

[INFO] read key-value file: rename
[INFO] 170 pairs of key-value loaded

but when I check my output file, the names are not changed. I don't understand what I am doing wrong.


The > is not part of the sequence name, so the pattern must not include it. Use:

-p "^(\S+)"
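To make the point about the leading > concrete, here is the same kind of key lookup written in plain awk. It is an alternative sketch, not the seqkit command itself: the > is stripped first, the first word of the header is used as the key, and headers without an entry in the mapping are left unchanged. File names and the unlisted-seq record are made up for the demo.

```shell
#!/bin/sh
# Sketch: rename FASTA headers by key lookup in plain awk.
# test.fa, rename.txt and output.fa are placeholder names.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '>monCan3F9-B-G1795-Map9\nTTTATTATACCC\n>unlisted-seq\nGGGG\n' > test.fa
printf 'monCan3F9-B-G1795-Map9\tBARI1#DNA/Tc1-Mariner\n' > rename.txt

awk -F '\t' '
    NR == FNR { new[$1] = $2; next }   # first file: load OLD -> NEW pairs
    /^>/ {
        id = substr($0, 2)             # drop the leading ">" before the lookup
        sub(/[ \t].*/, "", id)         # the key is the first word of the header
        if (id in new) {
            # swap the name, keep anything that followed it on the header line
            print ">" new[id] substr($0, length(id) + 2)
            next
        }
    }
    { print }                          # sequence lines and unmatched headers pass through
' rename.txt test.fa > output.fa
cat output.fa
```

A lookup that included the > in the key would find nothing in a mapping whose keys are bare names, which matches the symptom reported above: the pairs load, but no header changes.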

Thanks, it worked.


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.
