7.0 years ago
wangdp123
Dear Colleague,
I am using blastp from BLAST+ (ncbi-blast-2.6.0+) to generate tabular output (format 6), but oddly the sequence identifiers in the second column of the result file are altered:
ABC|ABC_DN00001_c0_g1 ABC:ABC_DN00002_c0_g1
ABC|ABC_DN00003_c0_g1 ABC:ABC_DN00004_c0_g1
ABC|ABC_DN00005_c0_g1 ABC:ABC_DN00006_c1_g1
It seems that blastp changed the "|" into ":".
I am not sure whether this is caused by using "-parse_seqids" in makeblastdb:
makeblastdb -in test.fasta -dbtype prot -parse_seqids
Could you help me with this?
Many thanks,
Best regards,
Tom
What sort of help do you need? Based on what you posted, that does look to be the case. If you need the pipe symbol back, you could find/replace it with sed.
I wonder what the exact function of -parse_seqids is, as the authors of BLAST+ strongly recommend using it. I tried removing the -parse_seqids argument from makeblastdb, and the ":" disappeared and the "|" appeared again. Why does this argument lead to this difference? Thanks.
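The find/replace suggested in the reply could look roughly like the sketch below. It uses awk rather than sed so that the change can be limited to the second column. It assumes the format 6 file is tab-separated and that only the first ":" in each subject ID needs to become "|". The function name restore_pipe and the sample line are made up for illustration.

```shell
# Hypothetical helper: restore "|" in the subject ID (column 2) of tabular BLAST output.
# Assumes tab-separated fields; reads from files given as arguments, or from stdin.
restore_pipe() {
  # sub() replaces only the first ":" found in field 2; other columns are untouched.
  awk 'BEGIN { FS = OFS = "\t" } { sub(/:/, "|", $2); print }' "$@"
}

# Demo on a single made-up result line (query ID, subject ID, percent identity).
printf 'ABC|ABC_DN00001_c0_g1\tABC:ABC_DN00002_c0_g1\t98.5\n' | restore_pipe
```

On a real result file the same function can be called as restore_pipe followed by the file name, redirecting the output to a new file. The other option, as noted above, is to rebuild the database without -parse_seqids.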