Problem with replacing accession number/ID only in FASTA file using seqkit
2.2 years ago
Priya ▴ 20

I have a task where I cannot get the values from my key-value file to show up in the replacement. I have tried several versions of this seqkit replace command but cannot get the right result. My key-value file:

WP_000014594.1 WP_000014594.1#0001
WP_000025662.1 WP_000025662.1#0001

Column 1 holds the accession numbers/IDs that appear in my FASTA file headers, and I need to replace them with the values given in column 2.

But I either get a blank there (command a) or partial IDs like >_000014594 (command b).

command a.

$ seqkit replace -p "^(\w+)_(\d+).(\d+)" --replacement "{kv}" -k accesskey_idsvalue1 GCF_000016305.1_ASM1630v1_protein.faa

command b.

$ seqkit replace -p "^(\w+)_(\d+).(\d+)" --replacement '{kv}_${2}' -k accesskey_idsvalue1 GCF_000016305.1_ASM1630v1_protein.faa
seqkit • 698 views
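Put the whole accession in a single capture group, because {kv} looks up the first capture group as the key, and escape the dot:

-p "^(\w+_\d+\.\d+)" -r "{kv}"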
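For anyone who wants to try this on a toy input first, here is a minimal sketch of the full command. The file names sample.faa, kv.tsv and renamed.faa, the header descriptions and the sequences are made up for illustration; they stand in for the real GCF_000016305.1_ASM1630v1_protein.faa and accesskey_idsvalue1. It also assumes the key-value file is tab-delimited, which is what seqkit's -k option documents.

```shell
# Made-up two-record FASTA standing in for the real protein file.
printf '>WP_000014594.1 hypothetical protein\nMKTAYIAK\n>WP_000025662.1 hypothetical protein\nMSDLLNQR\n' > sample.faa

# Key-value file: key and value separated by a TAB (not spaces).
printf 'WP_000014594.1\tWP_000014594.1#0001\nWP_000025662.1\tWP_000025662.1#0001\n' > kv.tsv

# One capture group around the whole accession, so {kv} is looked up
# with the key "WP_000014594.1" rather than just "WP".
if command -v seqkit >/dev/null 2>&1; then
    seqkit replace -p '^(\w+_\d+\.\d+)' -r '{kv}' -k kv.tsv sample.faa > renamed.faa
    cat renamed.faa
else
    echo "seqkit not found on PATH; skipping the replacement step" >&2
fi
```

With the original pattern "^(\w+)_(\d+).(\d+)" the first group only captures "WP", which is not a key in the file, and that would explain the blank in command a and the leftover _000014594 in command b. If the real key-value file is separated by spaces, converting it to tabs first is probably needed as well.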

If the FASTA headers only contain IDs like the ones in the original post, you can also use the pattern ^(\w+\.\w) or ^(\w+\.\d).
