Question: Makeblast error when Nucleotide sequence contanis 'X" character
gravatar for zhouyunping
4 days ago by
zhouyunping0 wrote:

hi, i catch a error when used makeblastdb commd for make blast db, the issur as below:

New DB title:  nucl_patent_01
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
FASTA-Reader: Ignoring invalid residues at position(s): On line 34764876: 2

when i sed the line 34764876, and i find there has a 'X' character in the pos of this line, Nucleotide sequence is GXACCTGATGTAGCAGACAGTCTC, what should i do if i want make this Nucleotide sequence into my blast db? the blast version is blast 2.7.1, makeblastcmd is :

makeblastdb -in part-r-00000 -dbtype nucl -title nucl_patent_01 -out /blast_db/nucl_patent_01
sequence • 60 views
ADD COMMENTlink modified 4 days ago by Vijay Lakhujani3.1k • written 4 days ago by zhouyunping0

Replacing X with N. This should work.

seqkit replace -i -s -p X -r N in.fa.gz -o out.fa.gz
ADD REPLYlink written 4 days ago by shenwei3564.1k
gravatar for gb
4 days ago by
gb430 wrote:

You can change the X to a N with something like this:

sed -i '/^>/! s/X/N/g' inputfasta.fa
ADD COMMENTlink modified 4 days ago • written 4 days ago by gb430
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1484 users visited in the last hour