Question: makeblastdb error when nucleotide sequence contains an 'X' character
4 days ago by
zhouyunping0 wrote:

Hi, I get an error when using the makeblastdb command to build a BLAST database. The output is below:

New DB title:  nucl_patent_01
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
FASTA-Reader: Ignoring invalid residues at position(s): On line 34764876: 2

When I printed line 34764876 with sed, I found an 'X' character at position 2 of that line. The nucleotide sequence is GXACCTGATGTAGCAGACAGTCTC. What should I do if I want to include this nucleotide sequence in my BLAST database? The BLAST version is 2.7.1, and the makeblastdb command is:

makeblastdb -in part-r-00000 -dbtype nucl -title nucl_patent_01 -out /blast_db/nucl_patent_01
modified 4 days ago by Vijay Lakhujani • written 4 days ago by zhouyunping

Replace X with N. This should work:

seqkit replace -i -s -p X -r N in.fa.gz -o out.fa.gz
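Before replacing anything, it can help to list every sequence line that contains a character outside the IUPAC nucleotide letters, since X may not be the only one. The following is only a rough sketch using awk on an uncompressed FASTA file; `demo.fa` and `bad_lines.txt` are hypothetical file names, and the demo records are made up apart from the sequence quoted in the question.

```shell
# Build a tiny demo FASTA (hypothetical data; seq2 reuses the sequence from the question).
cat > demo.fa <<'EOF'
>seq1
ACGTNACGT
>seq2
GXACCTGATGTAGCAGACAGTCTC
EOF

# For every non-header line that contains a character outside the IUPAC
# nucleotide letters, print "line number: line content".
awk '!/^>/ && /[^ACGTUNRYKMSWBDHVacgtunrykmswbdhv]/ { print NR ": " $0 }' demo.fa > bad_lines.txt

cat bad_lines.txt
```

On the demo file this reports line 4 only. On a real input, the reported line numbers can be compared with the ones in the FASTA-Reader message.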
written 4 days ago by shenwei356
4 days ago by
gb430 wrote:

You can change the X to an N with something like this:

sed -i '/^>/! s/X/N/g' inputfasta.fa
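A slightly more defensive variant, sketched under a few assumptions: it writes to a new file instead of editing in place, also handles lowercase x, and checks the result before the database is rebuilt. The file names `demo_in.fa` and `demo_fixed.fa` and the demo records are hypothetical; the commented-out makeblastdb call reuses the options from the question with the fixed file as input.

```shell
# Hypothetical demo input: the first header contains an X that must not be changed.
cat > demo_in.fa <<'EOF'
>seqX example header with an X that must stay untouched
GXACCTGATGTAGCAGACAGTCTC
>seq2
acgtxacgt
EOF

# On every line that is NOT a header, transliterate X -> N and x -> n.
# Output goes to a new file so the original stays available for comparison.
sed '/^>/!y/Xx/Nn/' demo_in.fa > demo_fixed.fa

# Count sequence lines that still contain X or x; this should be 0.
# grep -c exits non-zero when the count is 0, hence the "|| true".
remaining=$(grep -v '^>' demo_fixed.fa | grep -c '[Xx]' || true)
echo "sequence lines still containing X: $remaining"

# Then rebuild the database from the fixed file (left commented out so the
# sketch runs without BLAST installed):
# makeblastdb -in demo_fixed.fa -dbtype nucl -title nucl_patent_01 -out /blast_db/nucl_patent_01
```

Note that N stands for "any base", so the replacement keeps the sequence length and coordinates intact, whereas the FASTA-Reader message indicates the X residue is otherwise ignored.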
modified 4 days ago • written 4 days ago by gb