I'm having trouble getting bwa mem (v0.7.12) to consistently output all reads from the input FASTQ files. I'm working on a simulation study: I generate smallish sets of fake reads, then align them with bwa. The paired input FASTQs have a few thousand reads each. As far as I can tell, the missing reads all come from the ends of the input FASTQ files, and when I subset the inputs to just the reads that went missing in a previous attempt, those reads show up in the output again. It looks a little like BWA is forgetting to flush a buffer under some circumstances, or something like that.
Since I'm simulating the reads, I know exactly where they should go, and I give each one a unique ID so I can verify which ones exist in the FASTQs and in the output SAM files. The behavior seems identical on both Linux and Mac.
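For anyone who wants to reproduce the check, here is a rough Python sketch of the kind of ID comparison described above. It is not the exact script used; the function names `fastq_ids` and `sam_ids` and the in-memory example records are made up for illustration. It assumes plain four-line FASTQ records, and it strips a trailing `/1` or `/2` from FASTQ names on the assumption that the aligner drops that suffix when writing the SAM.

```python
import io

def fastq_ids(handle):
    """Collect read names from a FASTQ stream (assumes 4 lines per record)."""
    ids = set()
    for i, line in enumerate(handle):
        if i % 4 != 0:
            continue  # only the first line of each record is a header
        line = line.strip()
        if not line:
            continue  # tolerate a trailing blank line
        name = line[1:].split()[0]  # drop '@' and any description text
        # Assumption: mate suffixes are not kept in the SAM read names.
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        ids.add(name)
    return ids

def sam_ids(handle):
    """Collect read names (QNAME, column 1) from a SAM stream."""
    ids = set()
    for line in handle:
        if line.startswith("@") or not line.strip():
            continue  # skip header lines and blank lines
        # Any record counts as "present", mapped or unmapped.
        ids.add(line.split("\t", 1)[0])
    return ids

# Tiny made-up example: read3 is in the FASTQ but has no SAM record.
fq = (
    "@read1/1\nACGT\n+\nIIII\n"
    "@read2/1\nACGT\n+\nIIII\n"
    "@read3/1\nTTTT\n+\nIIII\n"
)
sam = (
    "@HD\tVN:1.6\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "read1\t99\tchr1\t1\t60\t4M\t=\t50\t53\tACGT\tIIII\n"
    "read2\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
)

missing = sorted(fastq_ids(io.StringIO(fq)) - sam_ids(io.StringIO(sam)))
print(missing)
```

With real files, the `io.StringIO` objects would be replaced by `open(...)` handles on the FASTQ and SAM. Note that an unmapped read still counts as present here, since it has a SAM record; only reads with no record at all end up in `missing`.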
Are there circumstances under which BWA will refuse to align or output a read, or a set of reads? Everything in the FASTQs should end up in the .sam, right? Has anyone else noticed anything similar?