Question: makeblastdb error when a nucleotide sequence contains an 'X' character
8 months ago, zhouyunping wrote:

Hi, I get an error when running the makeblastdb command to build a BLAST database. The output is below:

New DB title:  nucl_patent_01
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
FASTA-Reader: Ignoring invalid residues at position(s): On line 34764876: 2

When I printed line 34764876 with sed, I found an 'X' character at the reported position (2) of that line. The nucleotide sequence is GXACCTGATGTAGCAGACAGTCTC. What should I do if I want to include this nucleotide sequence in my BLAST database? The BLAST version is 2.7.1, and the makeblastdb command is:

makeblastdb -in part-r-00000 -dbtype nucl -title nucl_patent_01 -out /blast_db/nucl_patent_01
modified 8 months ago by Vijay Lakhujani • written 8 months ago by zhouyunping

Replace X with N; this should work.

seqkit replace -i -s -p X -r N in.fa.gz -o out.fa.gz
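
Before replacing anything, it may help to see which sequence lines contain characters the FASTA reader is likely to reject. The following is only a sketch: sample.fa is a made-up stand-in for the real input, and the character class assumes the standard IUPAC nucleotide codes plus the gap character are what counts as valid.

```shell
# Hypothetical sample file standing in for the real FASTA input.
printf '>seq1 example\nGXACCTGATGTAGCAGACAGTCTC\n>seqX header with X\nACGTNNACGT\n' > sample.fa

# Report line numbers of sequence lines (headers starting with '>' are skipped)
# that contain any character outside the assumed IUPAC nucleotide alphabet.
bad_lines=$(awk '!/^>/ && /[^ACGTUNRYKMSWBDHVacgtunrykmswbdhv-]/ { print NR }' sample.fa)
echo "$bad_lines"
```

On the sample above only line 2 is reported; the 'X' in the second header is ignored because header lines are excluded.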
written 8 months ago by shenwei356
8 months ago, gb wrote:

You can change the X to an N with something like this:

sed -i '/^>/! s/X/N/g' inputfasta.fa
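
If editing the file in place feels risky, a variant that writes to a new file could look like the sketch below. It swaps in sed's y (transliterate) command so that a lowercase x also becomes n; the file names are hypothetical, and the sample data is made up for illustration.

```shell
# Hypothetical sample file; the header deliberately contains an 'X'.
printf '>seqX patent X1\nGXACCTGATGxAGC\n' > sample.fa

# On every line that is not a header, map X to N and x to n.
# The original file is left untouched; the result goes to a new file.
sed '/^>/!y/Xx/Nn/' sample.fa > sample.fixed.fa
cat sample.fixed.fa

# The fixed file would then be passed to makeblastdb in place of the original, e.g.
# makeblastdb -in sample.fixed.fa -dbtype nucl -title nucl_patent_01 -out /blast_db/nucl_patent_01
```

Keeping the original file makes it easy to diff the two and confirm that only sequence lines changed.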
modified 8 months ago • written 8 months ago by gb