SortSam Error after BWA alignment
3 months ago
loy_loy • 0

Hi everyone,

I am running an alignment and quality control pipeline. After running bwa (aln or mem), I get an error in Picard SortSam:

"Exception in thread "main" htsjdk.samtools.SAMFormatException: SAM validation error: ERROR::INVALID_MAPPING_QUALITY:Record 18995527, Read name HWI-ST590:214:C1UJ3ACXX:6:1108:8659:37443, MAPQ should be 0 for unmapped read."

I set the validation stringency to "STRICT". I know that I can set it to "LENIENT" to ignore the error and proceed.

My question is whether my bwa settings could be causing this error, or whether it depends on my data.

This is my code for the bwa alignment:

bwa aln -t 16 ${REF} ${FASTQ1} > ${RG_ID}_1.sai && \
bwa aln -t 16 ${REF} ${FASTQ2} > ${RG_ID}_2.sai && \
bwa sampe -r ${READGROUP} ${REF} ${RG_ID}_1.sai ${RG_ID}_2.sai ${FASTQ1} ${FASTQ2} | \
samtools view -Shb -o <output_file> || \
{ echo "BWA-ALN failed"; exit 1; }

I also ran bwa mem on the data to see if it changes anything, but I get the same error. I'd like to run bwa aln in this case because the average read length is < 70 bp.

This is my code for Picard SortSam:

java -jar <picard_path> SortSam \
    -CREATE_INDEX true \
    -INPUT <input_file> \
    -SORT_ORDER coordinate \
    -OUTPUT <output_file> \
    -VALIDATION_STRINGENCY STRICT || \
{ echo "Sorting failed"; exit 1; }

Thank you!

Lynn

alignment mappingquality picard sortsam bwa • 311 views

Update: I got around this error by running Picard CleanSam after the BWA alignment. After cleaning, the error no longer appears, so I set the validation stringency back to "STRICT".
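
For anyone who wants to reproduce this workaround, here is a rough sketch of running CleanSam before SortSam. CleanSam soft-clips alignments that extend past the end of a reference and sets MAPQ to 0 for unmapped reads. The function name `clean_and_sort`, the `PICARD` variable, and the `picard.jar` default path are placeholders I made up for illustration; the option style simply follows the SortSam command in the question and may need adjusting for your Picard version.

```shell
#!/usr/bin/env bash
# PICARD is a hypothetical variable holding the Picard launcher command;
# "picard.jar" is an assumed path, replace it with your own.
PICARD="${PICARD:-java -jar picard.jar}"

# clean_and_sort <aligned.bam> <cleaned.bam> <sorted.bam>
# (hypothetical helper, not part of Picard)
clean_and_sort() {
    local in_bam="$1" cleaned_bam="$2" sorted_bam="$3"

    # Step 1: CleanSam resets MAPQ to 0 on unmapped reads and soft-clips
    # alignments that hang off the end of a reference sequence.
    $PICARD CleanSam \
        -INPUT "$in_bam" \
        -OUTPUT "$cleaned_bam" || { echo "Cleaning failed" >&2; return 1; }

    # Step 2: sort the cleaned file with strict validation, as in the question.
    $PICARD SortSam \
        -INPUT "$cleaned_bam" \
        -OUTPUT "$sorted_bam" \
        -SORT_ORDER coordinate \
        -CREATE_INDEX true \
        -VALIDATION_STRINGENCY STRICT || { echo "Sorting failed" >&2; return 1; }
}
```

It would be called as `clean_and_sort <input_file> <cleaned_file> <output_file>`, with the output of the bwa step as the first argument.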

3 months ago

This is not really an error; it's just a difference between how bwa treats some corner cases and how Picard expects those corner cases to be handled. Without seeing the exact read, it's impossible to say for certain what is happening in your case, but I know this problem can come up when a read aligns to the end of a reference sequence such that its end falls off the edge. bwa concatenates all the reference sequences together, so such a read looks like it straddles two references. bwa will (or at least my version will) give that read its mapping coordinates, and I guess a MAPQ, but also flag it as "unmapped" to signal that something is wrong. Picard objects to that combination of an unmapped read with a non-zero mapping quality.
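
If you want to look at the offending reads yourself, a minimal sketch like the following should list them. It keeps records whose FLAG has the "unmapped" bit (0x4) set and whose MAPQ (column 5) is not 0, which is exactly the combination Picard complains about. The function name `find_unmapped_nonzero_mapq` is made up for illustration, and the function expects SAM text on standard input.

```shell
#!/usr/bin/env bash
# find_unmapped_nonzero_mapq: hypothetical helper, reads SAM text on stdin.
# Prints read name, FLAG, reference, position and MAPQ for every record
# that is flagged unmapped (FLAG bit 0x4) but has a non-zero MAPQ.
find_unmapped_nonzero_mapq() {
    awk -F'\t' '
        /^@/ { next }                        # skip SAM header lines
        int($2 / 4) % 2 == 1 && $5 != 0 {    # bit 0x4 set and MAPQ non-zero
            print $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5
        }
    '
}
```

For example, `samtools view <input_file> | find_unmapped_nonzero_mapq | head` would show the first few such records. If their positions sit at the very end of a reference sequence, that would support the explanation above.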


Thank you! I guess I should ignore it then and set the validation stringency to "LENIENT" to proceed with my analysis?

Best

Lynn