blastp changed the "|" into ":" in the sequence identifier in the second column of the blastp result
7.0 years ago
wangdp123 ▴ 340

Dear Colleague,

I am using blastp from BLAST+ (ncbi-blast-2.6.0+) to generate format 6 (tabular) output, but oddly, the sequence identifiers in the second column of the output file are altered:

ABC|ABC_DN00001_c0_g1      ABC:ABC_DN00002_c0_g1
ABC|ABC_DN00003_c0_g1      ABC:ABC_DN00004_c0_g1
ABC|ABC_DN00005_c0_g1      ABC:ABC_DN00006_c1_g1

It seems that blastp changed the "|" into ":".

I am not sure whether this is due to the use of -parse_seqids in makeblastdb:

makeblastdb -in test.fasta -dbtype prot -parse_seqids

Would you like to help me out about this?

Many thanks,

Best regards,

Tom

blast+ • 1.0k views

Would you like to help me out about this?

What sort of help do you need? Based on what you posted, that does appear to be the case. If you need the pipe symbol back, you could find/replace with sed.
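A minimal sketch of that find/replace, assuming the format 6 file is tab-separated and that only the first ":" in column 2 should become "|". It uses awk rather than sed so the substitution is confined to the second column; the function name fix_subject_ids and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: rewrite the first ":" in column 2 (subject ID) as "|".
# Reads the files given as arguments, or stdin when none are given.
fix_subject_ids() {
  awk 'BEGIN { FS = OFS = "\t" }      # treat input and output as tab-separated
       { sub(/:/, "|", $2); print }   # edit only the second field, then print the row
  ' "$@"
}
```

Usage would look something like `fix_subject_ids blast_out.tsv > blast_out.fixed.tsv`. Query IDs in column 1 and all remaining columns pass through unchanged.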


I wonder what the exact function of -parse_seqids is, as the authors of BLAST+ strongly recommend its use. When I removed the -parse_seqids argument from the makeblastdb command, the ":" disappeared and the "|" appeared again. Why does this argument lead to this difference? Thanks
