Question: How to generate a BAM/SAM file to show a mutation using BWA aligner
gravatar for neranjan
19 months ago by
neranjan40 wrote:


I am trying to generate Single nucleotide mutations to a sequence, and trying to get SAM files and its related CIGAR sequences.

My test genome is: (ref.fa)


Then I created a fastq sequence (seq_1.fq) which I inserted a Nucleotide A, to the first line and ran the BWA alignment, then the SAM file

bwa mem ref.fa seq_1.fq >  seq_1.sam

samtools sort -o seq_1.bam --output-fmt BAM seq_1.sam

samtools view seq_1.bam | sam2pairwise

which shows the insertion by giving the CIGAR seq as 29M1I21M:

    genome  0   genome  1   60  29M1I21M    *   0   0
||||||||||||||||||||||||||||| |||||||||||||||||||||

Question: But when I tried to replace a T using G from 'TTTTT' to 'TTGTT' (seq_2.fq) and ran the BWA align and samtools it does not give me the desired CIGAR sequence !!! BUT it just gives it as 50M

 bwa mem ref.fa seq_2.fq >  seq_2.sam 

samtools sort -o seq_2.bam--output-fmt BAM seq_2.sam 

samtools view seq_2.bam | sam2pairwise

genome  0   genome  1   60  50M *   0   0

as it does not recognize the mutation which was introduced to the sequence.

So am missing something here ? How can I generate a CIGAR sequence which states, that there is a mismatch or SNP in the sequence ?

Thank you very much.

bwa snp alignment • 808 views
ADD COMMENTlink modified 19 months ago by h.mon29k • written 19 months ago by neranjan40
gravatar for Pierre Lindenbaum
19 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum128k wrote:

So am missing something here ?

yes. The 'M' operator doesn't mean that there is a perfect match. It means that the two sequences are best aligned if the read and the ref are aligned at this position. You want the derived symbols of 'M' which are '=' (match) and 'X' (mismatch) . See the SAM specification.

see also: the NM tag (edit distance) and How to convert from a cigar string to extended cigar string?

ADD COMMENTlink written 19 months ago by Pierre Lindenbaum128k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1227 users visited in the last hour