It is my understanding that samtools marks duplicates on the basis of the 5' position and the orientation of reads. This is based on my reading of the following:
However, I am not sure what exactly is meant by 'orientation' in this context.
In my mind, this can be interpreted in two ways:
1) Whether paired-end reads face inwards, outwards, or in the same direction
2) All of the above, plus whether the first reads (read 1) of the pairs being compared map to the same strand (i.e. the F1R2 and F2R1 nomenclature; cf. "Orientation of PE reads a review of --fr --ff and --rf meanings")
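
To make the difference between the two readings concrete, here is a minimal Python sketch of how each could be derived from SAM flags. It is not samtools' implementation. The function names `pair_geometry` and `strand_assignment` and the labels they return for same-strand pairs are hypothetical, and the geometry check compares only leftmost mapping positions, ignoring read lengths and clipping. The flag bits themselves (0x10, 0x20, 0x40) come from the SAM specification.

```python
# SAM FLAG bits, as defined in the SAM specification
READ_REVERSE = 0x10   # this read maps to the reverse strand
MATE_REVERSE = 0x20   # its mate maps to the reverse strand
FIRST_IN_PAIR = 0x40  # this read is read 1 of the pair


def pair_geometry(flag, pos, mate_pos):
    """Interpretation 1: do the reads face inwards, outwards, or the same way?

    Simplification (assumption): uses leftmost mapping positions only.
    """
    read_rev = bool(flag & READ_REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    if read_rev == mate_rev:
        return "same"
    # Work out where the forward-strand and reverse-strand reads start
    fwd_pos, rev_pos = (mate_pos, pos) if read_rev else (pos, mate_pos)
    return "inward" if fwd_pos <= rev_pos else "outward"


def strand_assignment(flag):
    """Extra information in interpretation 2: which strand read 1 is on."""
    read_rev = bool(flag & READ_REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    is_first = bool(flag & FIRST_IN_PAIR)
    # Translate "this read / its mate" into "read 1 / read 2"
    read1_rev = read_rev if is_first else mate_rev
    read2_rev = mate_rev if is_first else read_rev
    if not read1_rev and read2_rev:
        return "F1R2"
    if read1_rev and not read2_rev:
        return "F2R1"
    # Hypothetical labels for pairs where both reads are on the same strand
    return "R1R2" if read1_rev else "F1F2"


# Two pairs covering the same interval, seen from read 1 of each pair:
# flag 99 = read 1 forward, mate reverse; flag 83 = read 1 reverse, mate forward
print(pair_geometry(99, 100, 300), strand_assignment(99))   # inward F1R2
print(pair_geometry(83, 300, 100), strand_assignment(83))   # inward F2R1
```

Both example pairs are "inward" under interpretation 1, so they would look identical if orientation meant only that. Under interpretation 2 they differ (F1R2 vs F2R1), which is exactly the distinction I am asking about.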
If anybody can clarify which interpretation is correct, that would be great.