The Trim Galore manual says: Trims 1 bp off every read from its 3' end. This may be needed for FastQ files that are to be aligned as paired-end data with Bowtie. This is because Bowtie (1) regards alignments like this:
R1 --------------------------->
R2 <---------------------------
or this:
R1 ----------------------->
R2 <-----------------
as invalid (whenever a start/end coordinate is contained within the other read).
But I still can't understand why an alignment is invalid if a start/end coordinate of one read is contained within the other read.
Could someone explain that to me?
Thanks a lot.
I'd say that by default you'd expect paired-end reads to have a fairly large distance between them. An alignment in which the two reads overlap may mean that they come from a repetitive region: the real distance between the reads is large, but because both reads carry the same (repeated) sequence, the aligner maps them close to each other. However, a negative insert distance (overlapping mates) is not unusual in exome sequencing. I don't think I have ever run into this problem with bwa-mem.
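To make the rule quoted from the manual concrete, here is a minimal Python sketch of it as an interval check. This is only a simplified model of the condition as the manual describes it, not Bowtie's actual code. The function names (`contains`, `pair_rejected`, `trim1`) and the use of 1-based inclusive (start, end) coordinates are my own assumptions for illustration.

```python
# Simplified model (assumption): a pair is treated as invalid when one
# mate's alignment lies entirely within the other's. Coordinates are
# hypothetical 1-based inclusive (start, end) tuples on the reference.

def contains(outer, inner):
    """Return True if interval `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def pair_rejected(fwd, rev):
    """Return True if either mate is contained within the other."""
    return contains(fwd, rev) or contains(rev, fwd)


def trim1(fwd, rev):
    """Model trimming 1 bp from the 3' end of each mate.

    The forward mate loses its rightmost base; the reverse-strand mate
    loses its leftmost base, because its 3' end points to the left.
    """
    return (fwd[0], fwd[1] - 1), (rev[0] + 1, rev[1])


if __name__ == "__main__":
    # Both mates cover exactly the same span (first diagram in the manual).
    fwd, rev = (100, 149), (100, 149)
    print(pair_rejected(fwd, rev))          # containment before trimming
    print(pair_rejected(*trim1(fwd, rev)))  # mates are staggered after trimming
```

In this model, two mates covering the identical span are rejected, but after 1 bp is removed from each 3' end the mates are offset by one base in opposite directions, so neither contains the other any more. That would explain why trimming a single base is enough to rescue such pairs.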