Question: makeblastdb error when nucleotide sequence contains 'X' character
9 weeks ago, zhouyunping wrote:

Hi, I get an error when I use the makeblastdb command to build a BLAST database. The output is below:

New DB title:  nucl_patent_01
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
FASTA-Reader: Ignoring invalid residues at position(s): On line 34764876: 2

When I print line 34764876 with sed, I find an 'X' character at the reported position of that line; the nucleotide sequence is GXACCTGATGTAGCAGACAGTCTC. What should I do if I want to include this nucleotide sequence in my BLAST database? The BLAST version is 2.7.1, and the makeblastdb command is:

makeblastdb -in part-r-00000 -dbtype nucl -title nucl_patent_01 -out /blast_db/nucl_patent_01

Replace X with N. This should work:

seqkit replace -i -s -p X -r N in.fa.gz -o out.fa.gz
Comment written 9 weeks ago by shenwei356
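
Before replacing anything, it may help to check how many sequence lines contain characters that are not IUPAC nucleotide codes, since makeblastdb reports them one line at a time. The following is only a rough sketch: the file names demo.fa and report.txt and the two records are made up for illustration, and the accepted character set is an assumption based on the IUPAC nucleotide letters rather than on what makeblastdb itself checks.

```shell
# Tiny illustrative FASTA (hypothetical file name and records).
cat > demo.fa <<'EOF'
>seq1 example with X in header
GXACCTGATGTAGCAGACAGTCTC
>seq2
ACGTNNRYACGT
EOF

# For every non-header line, report the line number and 1-based column of the
# first character that is not an IUPAC nucleotide letter (assumed set below).
awk '!/^>/ && match($0, /[^ACGTURYSWKMBDHVNacgturyswkmbdhvn]/) {
    print "line " NR ", column " RSTART ": " substr($0, RSTART, 1)
}' demo.fa > report.txt

cat report.txt
```

On the demo file this reports the X in the second line and ignores the X in the header, which matches the "line: position" form of the makeblastdb message.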
9 weeks ago, gb wrote:

You can change the X to an N with something like this:

sed -i '/^>/! s/X/N/g' inputfasta.fa
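
A slightly more cautious variant of the same idea is sketched below. It writes to a new file instead of editing in place and also maps lower-case x to n, which the command above does not do. The file names demo_fix.fa and demo_fix.clean.fa and the records are hypothetical, and the sed `y` (transliterate) command is swapped in for the `s///g` substitution.

```shell
# Tiny illustrative FASTA (hypothetical file name and records).
cat > demo_fix.fa <<'EOF'
>patent_X1 sample
GXACCTGATGTAGCAGACAGTCTC
>patent_2
acgxtt
EOF

# On lines that do not start with '>', transliterate X to N and x to n.
# Header lines are left untouched, so IDs containing X are preserved.
sed '/^>/!y/Xx/Nn/' demo_fix.fa > demo_fix.clean.fa

cat demo_fix.clean.fa
```

The cleaned file can then be passed to makeblastdb with `-in` in place of the original input. N is the IUPAC code for "any base", so the position is kept as an ambiguous residue rather than dropped.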