Question: Benchmarking single SNP mapping
1
gravatar for Josh Herr
4 months ago by
Josh Herr5.6k
University of Nebraska
Josh Herr5.6k wrote:

I've come up with a homework exercise for my class on SNP mapping and benchmarking. The comical part is I can't figure out why I am not mapping the SNP myself. I'm not a novice at this anymore, but I'm banging my head against the wall to get this to work.

For the exercise, I have taken a 13,000 bp reference genomic region for an existing genome. I then changed one base in the reference in a well known gene and synthesized fake reads using wgsim such that the reads have only one single nucleotide difference against the reference.

(I used this command to generate the reads: wgsim -N 200000 -S 0 -e 0 -s 0 -R 0 -X 0 -1 70 -2 70 reference_with_SNP.fasta test_forward.fastq test_reverse.fastq)

Next, using the reference, I generated an index (I'm using bwa here for the example) and then mapped the reads with the SNP to the reference without the SNP: bwa mem reference.fasta test_forward.fastq test_reverse.fastq > test.sam. I'm getting a SAM file with all perfect matches (i.e. 70M). I've modified a lot of the bwa mem flags, but I can't seem to record any SNPs.

If I take a fasta file of my mapping fastq reads I can align the fasta to the reference and see where the bases don't match.

Any ideas if my error is in the mapping of my SNPs (wrong bwa mem flag) or in the read synthesis with wgsim?

mapping bwa wgsim bowtie snps • 182 views
ADD COMMENTlink modified 4 months ago by Biostar ♦♦ 20 • written 4 months ago by Josh Herr5.6k
2

CIGAR M is an alignment match, not necessarily a nucleotide match. I'm not sure if bwa uses extended cigar.

ADD REPLYlink modified 4 months ago • written 4 months ago by WouterDeCoster32k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1457 users visited in the last hour