Dear esteemed professionals,
After running a simple SNP calling pipeline (bwa -> mpileup (SAMtools) -> seqtk), I obtained a FASTA file containing a single nucleotide sequence, the consensus sequence for a virus. However, I noticed that some positions are ambiguous: codes such as M and Y each stand for more than one possible nucleotide. I then ran blastn against a database built from a FASTA file in my directory, which contains multiple virus nucleotide sequences with no ambiguous nucleotides. When I open the output file, the E-value for every hit is 0. Is there any way to replace the ambiguous codes with the most frequently occurring nucleotide during the SNP calling pipeline, or an option in the BLAST+ package that handles ambiguous nucleotides properly?
Thank you all so much for your help; any advice is greatly appreciated.
The BLAST process is (on Unix):
makeblastdb -in db.fa -dbtype nucl
blastn -query cnc.fa -db db.fa -out result.fa
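
To make the first option concrete, here is a minimal Python sketch of the replacement I have in mind. It assumes that per-position read base counts are already available, for example parsed from the pileup. The `base_counts` mapping and both function names are hypothetical and are not part of SAMtools, seqtk, or BLAST. This only illustrates the idea and is not a tested part of my pipeline.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes and the bases each one can stand for.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_ambiguous(consensus):
    """Count how often each ambiguity code occurs in the consensus."""
    return Counter(b for b in consensus.upper() if b in IUPAC)


def resolve_ambiguous(consensus, base_counts):
    """Replace each ambiguity code with the most frequent allowed base.

    base_counts is a hypothetical mapping: 0-based position -> Counter of
    read bases observed at that position. Positions without counts are
    left unchanged. The output is upper-cased.
    """
    out = []
    for i, base in enumerate(consensus.upper()):
        allowed = IUPAC.get(base)
        counts = base_counts.get(i)
        if allowed is None or not counts:
            # Unambiguous base, or no read evidence: keep as is.
            out.append(base)
            continue
        # Choose the allowed base with the highest read count.
        # Ties go to the base listed first in `allowed`.
        out.append(max(allowed, key=lambda b: counts.get(b, 0)))
    return "".join(out)


if __name__ == "__main__":
    seq = "ACMTY"
    counts = {
        2: Counter({"A": 3, "C": 7}),  # M = A or C
        4: Counter({"T": 9, "C": 1}),  # Y = C or T
    }
    print(count_ambiguous(seq))
    print(resolve_ambiguous(seq, counts))
```

With the made-up counts above, M at position 2 would become C and Y at position 4 would become T. What I do not know is how to get the pipeline itself to emit the majority base instead of the ambiguity code.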