Question: SNP calling, consensus sequence and running Blastn
3.9 years ago by low.weiquan wrote:

Dear esteemed professionals,

After running a simple SNP-calling pipeline (bwa -> SAMtools mpileup -> seqtk), I obtained a FASTA file containing a single nucleotide sequence as the consensus sequence for a virus. However, I noticed that some positions carry ambiguous IUPAC codes, for example M and Y, each of which can represent more than one nucleotide. I then ran blastn against a local database built from a FASTA file of multiple virus nucleotide sequences, none of which contain ambiguous nucleotides. When I open the output file, the E-value for every hit is 0. Is there any way to replace the ambiguous codes with the most frequently occurring nucleotide during the SNP-calling pipeline, or is there an option in the BLAST package that handles ambiguous nucleotides properly?
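To make the requested replacement concrete, here is a minimal Python sketch of resolving each ambiguity code to the most frequent base it allows. It is not part of the pipeline above: the function names and the `counts_by_pos` structure (per-position read-base tallies, which would have to be derived from the pileup) are hypothetical and used only for illustration.

```python
from collections import Counter

# IUPAC ambiguity codes and the bases each one can stand for.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def ambiguous_positions(seq):
    """Return (0-based index, code) for every IUPAC ambiguity code in seq."""
    return [(i, b) for i, b in enumerate(seq.upper()) if b in IUPAC]

def resolve(seq, counts_by_pos):
    """Replace each ambiguity code with the most frequent base it allows.

    counts_by_pos is a hypothetical dict {position: Counter of read bases};
    positions without counts are left unchanged.
    """
    out = list(seq.upper())
    for i, code in ambiguous_positions(seq):
        counts = counts_by_pos.get(i)
        if not counts:
            continue
        # Only consider bases that the ambiguity code can represent.
        out[i] = max(IUPAC[code], key=lambda b: counts.get(b, 0))
    return "".join(out)

# Toy input for illustration only, not real data.
consensus = "ACGMTTYGA"
counts = {3: Counter({"A": 40, "C": 12}), 6: Counter({"T": 9, "C": 30})}
print(ambiguous_positions(consensus))
print(resolve(consensus, counts))
```

On a tie, `max` keeps the first allowed base in the order listed in the table, so a real implementation would need an explicit tie-breaking rule.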

Thank you all so much for helping, any advice is greatly appreciated.

The BLAST process is (on Unix):

makeblastdb -in db.fa -dbtype nucl

blastn -query cnc.fa -db db.fa -out result.fa
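An E-value of 0 in BLAST output generally indicates a very strong match (the expected number of chance hits rounds to zero), so identity and mismatch counts are usually more informative for checking how the ambiguous positions were aligned. The Python sketch below reads BLAST+ tabular output; it assumes blastn was rerun with `-outfmt 6`, which is not in the command above, and the two example lines are made up for illustration.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def read_hits(handle):
    """Yield one dict per hit, converting the numeric columns used below."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        for key in ("length", "mismatch", "gapopen"):
            hit[key] = int(hit[key])
        yield hit

# Made-up example lines, for illustration only (not real results).
example = io.StringIO(
    "cns\tvirusA\t99.10\t9800\t88\t0\t1\t9800\t1\t9800\t0.0\t17500\n"
    "cns\tvirusB\t91.25\t9700\t849\t3\t1\t9700\t5\t9704\t0.0\t13000\n"
)

# Rank hits by bit score, since the E-values are all 0 and cannot separate them.
hits = sorted(read_hits(example), key=lambda h: h["bitscore"], reverse=True)
for h in hits:
    print(h["sseqid"], h["pident"], h["mismatch"], h["evalue"])
```

With a real results file, `open("result.tsv")` (a hypothetical file name) would take the place of the `io.StringIO` example.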

snp blast • 1.4k views