How to call variants (SNP, indel, SV) on a BAC contig aligned with BWA-MEM to a reference?
10.9 years ago
William ★ 5.3k

I aligned a BAC contig (assembled from Sanger sequences) to a reference genome using BWA-MEM. The resulting alignments are very similar to the best end-to-end alignment I got from aligning the BAC contig to the reference with BLAT.

The nice thing is that the output is a BAM file (convenient for visualization and parsing) and that inversions and translocations of parts of the BAC contig relative to the reference are also supported by BWA-MEM.

But how do I now interpret the alignments made by BWA-MEM as the SNPs, indels and SVs that the BAC contig has relative to the reference?

SNPs and indels are fairly easy to see in the data. But because the BAC contig is reported as multiple separate alignments, it is hard to see what is going on in terms of SVs.

I want to use the variants gathered from the BAC sequence to estimate false negative (FN) and false positive (FP) rates for the same strain sequenced and variant-called with short-read data.
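A minimal sketch of that comparison, assuming both variant sets have already been restricted to the reference region covered by the BAC and normalized to the same representation. The function name and the `(chrom, pos, ref, alt)` key are illustrative choices, not part of any tool. Note that without a count of true negatives, the "FP rate" computed here is really the fraction of short-read calls that the BAC does not confirm.

```python
def fn_fp_rates(bac_variants, short_read_calls):
    """Compare two collections of (chrom, pos, ref, alt) tuples.

    bac_variants is treated as the truth set; short_read_calls is the set
    being evaluated. Both names are hypothetical.
    """
    truth = set(bac_variants)
    calls = set(short_read_calls)
    tp = len(truth & calls)   # found by both
    fn = len(truth - calls)   # in the BAC, missed by short reads
    fp = len(calls - truth)   # called from short reads, absent in the BAC
    fn_rate = fn / (tp + fn) if truth else 0.0
    fp_rate = fp / (tp + fp) if calls else 0.0  # fraction of unconfirmed calls
    return fn_rate, fp_rate


bac = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr1", 300, "G", "GA")}
short = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr1", 400, "T", "C")}
print(fn_fp_rates(bac, short))
```

Exact tuple matching is strict: an indel written at a different position in the two call sets would count as both an FN and an FP, so left-aligning indels before comparing is probably worthwhile.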

snp indel sv • 3.8k views

I think that, since the MEM method is still pretty new, few tools handle this way of representing the alignments.


The output is a valid BAM file, so samtools may be able to call or extract the SNPs and indels. Otherwise I could parse them myself from the CIGAR strings and the FASTA using Picard.
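To illustrate the "parse them myself" route, here is a rough sketch in plain Python rather than Picard. It walks one alignment's CIGAR string against the reference and reports mismatches, insertions and deletions. The function and parameter names are made up for the example; it assumes `ref_seq` is indexed from the start of the chromosome, `ref_start` is the SAM POS minus 1, and `query_seq` is the SEQ field of the record (hard-clipped bases absent, soft-clipped bases present).

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def variants_from_alignment(ref_seq, ref_start, query_seq, cigar):
    """Return (type, 1-based ref position, ref allele, query allele) tuples."""
    variants = []
    r, q = ref_start, 0  # 0-based cursors on reference and query
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":
            # Aligned block: compare base by base to find mismatches.
            for i in range(length):
                rb, qb = ref_seq[r + i], query_seq[q + i]
                if rb.upper() != qb.upper():
                    variants.append(("SNP", r + i + 1, rb, qb))
            r += length
            q += length
        elif op == "I":
            # Position is the reference base just before the insertion.
            variants.append(("INS", r, "", query_seq[q:q + length]))
            q += length
        elif op == "D":
            # Position is the first deleted reference base.
            variants.append(("DEL", r + 1, ref_seq[r:r + length], ""))
            r += length
        elif op == "N":
            r += length
        elif op == "S":
            q += length
        # H and P consume neither reference nor query bases.
    return variants


ref = "ACGTACGTACGT"
query = "ACTTGGACTAC"
for v in variants_from_alignment(ref, 0, query, "4M2I2M1D3M"):
    print(v)
```

Each split alignment of the contig would be passed through this separately, and variants falling in regions where two alignments overlap on the query would need deduplicating.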

Also, the hard-clip information at the start and end of each separate alignment gives me the query start and end coordinates for that alignment. Extracting the SNPs and indels is probably the easy part, but I am not sure yet how to handle the potential SVs.
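One possible way to approach the SV part, sketched under assumptions: recover each alignment's interval on the contig from its clipping, sort the pieces along the contig, and label each junction between neighbouring pieces. For reverse-strand records the CIGAR describes the reverse-complemented contig, so the leading and trailing clips swap roles. The dictionary keys (`chrom`, `pos`, `cigar`, `is_reverse`) and the labels are hypothetical, and the labels are only candidates to inspect, not calls.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def clip_length(ops):
    """Total length of the leading run of soft/hard clips."""
    total = 0
    for n, op in ops:
        if op not in "SH":
            break
        total += n
    return total


def query_interval(cigar, is_reverse):
    """0-based half-open interval on the contig in its original orientation."""
    ops = parse_cigar(cigar)
    lead, trail = clip_length(ops), clip_length(ops[::-1])
    aligned = sum(n for n, op in ops if op in "MI=X")
    start = trail if is_reverse else lead
    return start, start + aligned


def ref_interval(pos, cigar):
    """0-based half-open interval on the reference (pos is 1-based SAM POS)."""
    span = sum(n for n, op in parse_cigar(cigar) if op in "MDN=X")
    return pos - 1, pos - 1 + span


def classify_junctions(alignments):
    """Label the junction between each pair of neighbouring contig pieces."""
    pieces = sorted(alignments,
                    key=lambda a: query_interval(a["cigar"], a["is_reverse"]))
    calls = []
    for a, b in zip(pieces, pieces[1:]):
        qa = query_interval(a["cigar"], a["is_reverse"])
        qb = query_interval(b["cigar"], b["is_reverse"])
        if a["chrom"] != b["chrom"]:
            kind = "translocation candidate"
        elif a["is_reverse"] != b["is_reverse"]:
            kind = "inversion candidate"
        else:
            ra = ref_interval(a["pos"], a["cigar"])
            rb = ref_interval(b["pos"], b["cigar"])
            # On the reverse strand, moving along the contig moves left on the reference.
            ref_gap = ra[0] - rb[1] if a["is_reverse"] else rb[0] - ra[1]
            extra = ref_gap - (qb[0] - qa[1])
            if extra > 0:
                kind = "deletion candidate"
            elif extra < 0:
                kind = "insertion or duplication candidate"
            else:
                kind = "colinear"
        calls.append((qa[1], kind))  # contig coordinate where piece a ends
    return calls


example = [
    {"chrom": "chr1", "pos": 1501, "cigar": "500H500M", "is_reverse": False},
    {"chrom": "chr1", "pos": 1, "cigar": "500M500H", "is_reverse": False},
]
print(classify_junctions(example))
```

In the example, the two halves of a 1000 bp contig map 1000 bp apart on the reference with no gap on the contig, so the junction is labelled as a deletion candidate in the contig relative to the reference.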
