Question: From FASTQ to clean BAM using GATK tutorial #6484
Written 2.6 years ago by lamteva.vera (180), Ukraine, Kyiv:

Hi!

I'm trying to build an efficient pipeline for processing amplicon sequencing data. The problem is that ValidateSamFile reports a bunch of errors in BAM files after running BAMClipper (whereas the BAMs were error-free before). Example output of ValidateSamFile (MODE=SUMMARY):

HISTOGRAM   java.lang.String
Error Type                                   Count
ERROR:INVALID_FLAG_SUPPLEMENTARY_ALIGNMENT   138
ERROR:INVALID_MAPPING_QUALITY                315
ERROR:MISMATCH_FLAG_MATE_UNMAPPED            217
ERROR:MISMATCH_MATE_ALIGNMENT_START          8775
ERROR:MISMATCH_MATE_CIGAR_STRING             2385125
WARNING:MISSING_TAG_NM                       2387464
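To see which problems dominate, the summary above can be tallied programmatically. The following is only a minimal sketch: the function name `parse_validation_summary` is made up for illustration, and it assumes the summary is available as plain text with one `ERROR:` or `WARNING:` row per line, as pasted above.

```python
def parse_validation_summary(text):
    """Parse ValidateSamFile MODE=SUMMARY text into {(severity, issue): count}."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        # Only rows that start with ERROR: or WARNING: carry a count.
        if not line.startswith(("ERROR:", "WARNING:")):
            continue
        label, count = line.rsplit(None, 1)      # split off the trailing number
        severity, issue = label.split(":", 1)    # "ERROR", "MISMATCH_..."
        counts[(severity, issue)] = int(count)
    return counts


# The summary quoted in the question.
summary = """\
HISTOGRAM   java.lang.String
Error Type  Count
ERROR:INVALID_FLAG_SUPPLEMENTARY_ALIGNMENT  138
ERROR:INVALID_MAPPING_QUALITY   315
ERROR:MISMATCH_FLAG_MATE_UNMAPPED   217
ERROR:MISMATCH_MATE_ALIGNMENT_START 8775
ERROR:MISMATCH_MATE_CIGAR_STRING    2385125
WARNING:MISSING_TAG_NM  2387464
"""

counts = parse_validation_summary(summary)
errors = {issue: n for (severity, issue), n in counts.items() if severity == "ERROR"}

# List the errors from most to least frequent.
for issue, n in sorted(errors.items(), key=lambda kv: -kv[1]):
    print(f"{issue}\t{n}")
```

In this output nearly all errors are MISMATCH_MATE_CIGAR_STRING. That would be consistent with clipped reads whose CIGAR no longer matches the mate-CIGAR (MC) tag stored on their mates, though this is an assumption that has not been checked against the files.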

I've read that MergeBamAlignment is a powerful tool for cleaning BAM files while preserving original read information and base quality scores. So I decided to incorporate GATK tutorial #6484 into my analysis pipeline to get rid of the errors.

I just want to ask the community's opinion about the following workflow:

[Image: workflow diagram (not available)]

I could have missed something. Any critical thoughts are welcome.

modified 2.6 years ago • written 2.6 years ago by lamteva.vera (180)

If I am reading the flow diagram right, why are you adding the unaligned BAM data back into the final BAM? Isn't that duplicating many reads (an aligned copy and an original copy)?

modified 2.6 years ago • written 2.6 years ago by genomax (84k)

GATK claims that

Broadly, the tool [MergeBamAlignment] merges defined information from the unmapped BAM (uBAM, step 1) with that of the aligned BAM (step 3) to conserve read data, e.g. original read information and base quality scores.
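For concreteness, the uBAM, alignment and merge steps mentioned in that quote can be sketched as Picard command lines. This is a rough sketch, not the workflow from the diagram: all file names, the sample and read-group names, and the `picard.jar` path are placeholders, the aligner and clipping steps are left out, and only a small set of options is shown. The commands are built and printed, not executed.

```python
import shlex

PICARD = ["java", "-jar", "picard.jar"]  # assumed location of the Picard jar


def fastq_to_ubam(fq1, fq2, ubam, sample, read_group):
    """FastqToSam: wrap paired FASTQ files in an unmapped BAM (uBAM)."""
    return PICARD + [
        "FastqToSam",
        f"FASTQ={fq1}",
        f"FASTQ2={fq2}",
        f"OUTPUT={ubam}",
        f"SAMPLE_NAME={sample}",
        f"READ_GROUP_NAME={read_group}",
    ]


def merge_bam_alignment(reference, ubam, aligned, merged):
    """MergeBamAlignment: combine aligner output with the uBAM's read data."""
    return PICARD + [
        "MergeBamAlignment",
        f"R={reference}",
        f"UNMAPPED_BAM={ubam}",
        f"ALIGNED_BAM={aligned}",
        f"O={merged}",
        "CREATE_INDEX=true",
        "ADD_MATE_CIGAR=true",  # write the mate CIGAR (MC) tag
    ]


def validate(bam):
    """ValidateSamFile in summary mode, as used in the question."""
    return PICARD + ["ValidateSamFile", f"I={bam}", "MODE=SUMMARY"]


# Hypothetical file names, for illustration only.
commands = [
    fastq_to_ubam("sample_R1.fastq.gz", "sample_R2.fastq.gz",
                  "sample.unmapped.bam", "sample", "rg1"),
    merge_bam_alignment("reference.fasta", "sample.unmapped.bam",
                        "sample.aligned.bam", "sample.merged.bam"),
    validate("sample.merged.bam"),
]

for cmd in commands:
    print(shlex.join(cmd))
```

Each command is kept as an argument list so that it could later be passed to `subprocess.run` without shell quoting issues.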

modified 2.6 years ago • written 2.6 years ago by lamteva.vera (180)

I see. Have you compared the merged BAM with the aligned BAM to see what MergeBamAlignment did?
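One way to make that comparison is to diff the two files record by record. The sketch below assumes both BAMs have first been converted to SAM text (for example with `samtools view -h`). The helper names `parse_sam` and `diff_records` and the two toy records are invented for illustration and are not real pipeline output.

```python
def parse_sam(text):
    """Index primary SAM records by (read name, mate number)."""
    records = {}
    for line in text.splitlines():
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        # Optional fields look like TAG:TYPE:VALUE, e.g. NM:i:0
        tags = {t[:2]: t[5:] for t in fields[11:]}
        records[(fields[0], mate)] = {
            "flag": flag,
            "pos": int(fields[3]),
            "mapq": int(fields[4]),
            "cigar": fields[5],
            "tags": tags,
        }
    return records


def diff_records(aligned, merged):
    """Report core fields that changed and tags present only in the merged file."""
    report = {}
    for key in sorted(aligned.keys() & merged.keys()):
        a, m = aligned[key], merged[key]
        changed = [name for name in ("flag", "pos", "mapq", "cigar")
                   if a[name] != m[name]]
        added = sorted(m["tags"].keys() - a["tags"].keys())
        if changed or added:
            report[key] = {"changed": changed, "tags_added": added}
    return report


# Toy records, made up for illustration.
aligned_sam = "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII"
merged_sam = aligned_sam + "\tMC:Z:4M\tNM:i:0\tRG:Z:rg1"

report = diff_records(parse_sam(aligned_sam), parse_sam(merged_sam))
print(report)
```

Reads that appear in only one of the two files are ignored here. Counting them separately would address the duplication concern raised earlier in the thread.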

modified 2.6 years ago • written 2.6 years ago by genomax (84k)