BWA alt-aware pipeline troubleshooting
0
0
Entering edit mode
9 hours ago
Duy • 0

Hi everyone,

I have a question regarding the BWA alt-aware pipeline since I haven't seen much guideline online other than https://gatk.broadinstitute.org/hc/en-us/articles/360037498992--How-to-Map-reads-to-a-reference-with-alternate-contigs-like-GRCH38#2 (though it might a little bit dated).

Currently, my alt-unaware pipeline would be:

bwa mem | samtools fixmate | samtools sort | samtools markdup

Since my region of interest is the MHC, at first, I tried to pipe bwa-postalt.js (https://github.com/lh3/bwa/tree/master/bwakit) after samtools markdup but it resulted in an error in subsequent samtools sort step (the SAM file will be read-name-sorted after using the js script).

bwa mem | samtools fixmate | samtools sort | samtools markdup | k8 bwa-postalt.js | samtools sort

After a few tries, I decided to break the pipe into two commands and it ran without errors but it would create an intermediate sam file (the input and output of bwa-postalt.js is in SAM format).

bwa mem | samtools fixmate | samtools sort | samtools markdup

k8 bwa-postalt.js | samtools sort

Another pipeline I also tried was to do post-alt processing step after mapping with bwa mem then progress with the rest of the pipeline (adaptation from https://pmc.ncbi.nlm.nih.gov/articles/PMC5522380/).

bwa mem | k8 bwa-postalt.js | samtools fixmate | samtools sort | samtools markdup

So, my question is which pipeline should be used in my analysis since the BAM files created by these two pipelines were not identical though the number of duplicated reads in markdup stats file from both pipelines were roughly the same.

Two-command pipeline:

EXCLUDED: 1174034

EXAMINED: 4792305

DUPLICATE PAIR: 2283688

One-command pipeline:

EXCLUDED: 38478395

EXAMINED: 4792304

DUPLICATE PAIR: 2283694

Since the VCF files generated through both pipelines with HaplotypeCaller produced different sets of variants, I'm having a hard time deciding which pipeline to follow through for my dataset. I'm currently inclining toward the one-command pipeline because it does not produce an intermediate file (and it feels more natural!), but any expert inputs on the matter is more than welcome.

Cheers.

alt-aware BWA samtools • 196 views
ADD COMMENT

Login before adding your answer.

Traffic: 2907 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6