It looks like no one has actually read the question correctly. PAF carries much less information than SAM because it is a summary of the alignment, but it gives easy access to the alignment start and end coordinates, sequence lengths, etc. Because PAF is the less information-dense format, the conversion only works one way: you cannot rebuild SAM from PAF, but there is no reason why you can't convert from SAM to PAF.
There are several options:
- Use one of Heng Li's tools:
  - htsbox (his experimental toolkit): htsbox samview -p in.bam (https://github.com/lh3/htsbox)
  - paftools.js sam2paf, which ships with minimap2: https://github.com/lh3/minimap2/blob/master/misc/README.md#introduction
- Use this Python library: https://bioconvert.readthedocs.io/en/master/_modules/bioconvert/sam2paf.html
I've been looking for the answer to this question, and I am planning on using #2.
I'm not sure how "Matches" are interpreted, but the existence of a sam2paf converter implies that it probably counts every base covered by an M operation (whether it is a true match "=" or a mismatch "X"), regardless of whether the read actually agrees with the reference inside that block.
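To make that guess concrete, here is a minimal Python sketch of how the coordinate and count columns of a PAF line could be derived from a SAM record's POS, strand and CIGAR. It is not taken from htsbox, paftools.js or bioconvert; the function name `cigar_to_paf_fields` and its return layout are made up for illustration. It assumes a mapped record with a real CIGAR (not `*`), counts every M, = and X base as a "match" as speculated above, and leaves N (skipped region) out of the block length, which is also an assumption. Query and target names, target length (from the @SQ header) and MAPQ would be copied over separately.

```python
import re

# One CIGAR element: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_to_paf_fields(cigar, pos, reverse=False):
    """Hypothetical helper: derive PAF-style fields from a SAM CIGAR.

    cigar   -- CIGAR string of a mapped record (not '*')
    pos     -- 1-based leftmost reference position (SAM POS)
    reverse -- True if SAM FLAG has bit 0x10 set
    Returns (q_len, q_start, q_end, t_start, t_end, matches, block_len)
    with 0-based, end-exclusive coordinates as PAF uses.
    """
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]

    # Soft/hard clips at the left end of the record as written in SAM.
    lead_clip = 0
    for n, op in ops:
        if op not in "SH":
            break
        lead_clip += n

    # Soft/hard clips at the right end.
    trail_clip = 0
    for n, op in reversed(ops):
        if op not in "SH":
            break
        trail_clip += n

    q_aligned = sum(n for n, op in ops if op in "MI=X")   # consumes query
    t_aligned = sum(n for n, op in ops if op in "MDN=X")  # consumes reference
    matches = sum(n for n, op in ops if op in "M=X")      # assumption: all M-like bases
    block_len = sum(n for n, op in ops if op in "MID=X")  # assumption: N not counted

    q_len = lead_clip + q_aligned + trail_clip
    # PAF reports query coordinates on the original read strand, so for a
    # reverse-strand record the clip that is written last comes first.
    q_start = trail_clip if reverse else lead_clip
    q_end = q_start + q_aligned
    t_start = pos - 1
    t_end = t_start + t_aligned
    return q_len, q_start, q_end, t_start, t_end, matches, block_len
```

For example, `cigar_to_paf_fields("5S10M2I3D8M4H", 101)` treats the 18 bases inside the two M blocks as matches without looking at the sequence. If the record carries an NM tag, the number of true matches could be tightened by subtracting the mismatches, but that goes beyond this sketch.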
There is no direct way to do that. Realignment is necessary; see the developer's (lh3) answer here: https://github.com/lh3/minimap2/issues/493 (the first hit when searching for "convert sam to paf", by the way...).
Here is a section that explains how: https://github.com/lh3/miniasm#getting-started