Which aligner is more suitable for ONT R10.4.1 Dorado-corrected reads: minimap2 or winnowmap?
19 hours ago
tungsega ▴ 10

Hello everyone,

I currently have ONT R10.4.1 simplex reads corrected by Dorado, and I am trying to decide which aligner would be more suitable: minimap2 or winnowmap.

I plan to perform the following analyses after mapping:

  1. ONT read phasing, using an existing VCF generated from short reads (from the same sample) and the hg38 reference.
  2. Assembly polishing.
  3. Structural variant calling.

I know that minimap2 already provides the lr:hq and lr:hqae presets specifically designed for ONT R10.4.1 data. On the other hand, winnowmap currently seems to lack equivalent parameters to reproduce the lr:hq preset environment (issue for lr:hq preset discussion).

However, I’m not sure whether minimap2’s latest version has improved performance in repetitive regions compared to previous releases.

I’d really appreciate hearing others’ experiences or opinions: Which aligner would you recommend for ONT R10.4.1 Dorado-corrected reads, especially for downstream phasing and SV detection?

Thank you very much for your time and suggestions!

Best Regards

winnowmap minimap2 R10.4.1 Nanopore • 202 views

Good summary. I have never tried winnowmap despite its apparent advantages. If you have really repeat-rich regions such as centromeres, then it may be useful. However, my gut feeling is that ONT sequencing has moved on considerably in the last 2 years, so the error model of Winnowmap might not be appropriate any more. Perhaps ask the authors? Certainly, the more modern minimap2 presets would appear to be beneficial for dealing with the latest ONT data.
If this is just one alignment, though, why not run both and let the community know about your experiences?


Thank you for your reply.

I’m currently trying to perform alignments using both aligners, but I wanted to ask if anyone has conducted a similar systematic comparison before.

Actually, I’m not quite sure about the best way to evaluate the performance differences between aligners. I was thinking of comparing metrics such as the NM tag, samtools flagstat, or MAPQ values.

Could you please give me some advice or suggestions on how to properly compare the results?
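One way to look at the MAPQ and NM metrics mentioned above is to summarize them directly from the alignment records. The following is only a minimal sketch, assuming plain SAM text lines as input (for example the output of `samtools view`); the function name `summarize_sam` and the toy records are hypothetical and exist purely for illustration. It counts primary mapped alignments, builds a MAPQ histogram, and sums the NM tag, skipping unmapped, secondary, and supplementary records.

```python
from collections import Counter

# SAM flag bits: 0x4 unmapped, 0x100 secondary, 0x800 supplementary
SKIP_FLAGS = 0x4 | 0x100 | 0x800


def summarize_sam(lines):
    """Summarize primary mapped alignments from SAM text lines."""
    mapq_hist = Counter()
    total_nm = 0
    n_primary = 0
    for line in lines:
        # Skip blank lines and header lines
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & SKIP_FLAGS:
            continue
        n_primary += 1
        mapq_hist[int(fields[4])] += 1
        # Optional tags start at column 12; NM is the edit distance
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                total_nm += int(tag[5:])
                break
    return {
        "primary_mapped": n_primary,
        "mapq_hist": dict(mapq_hist),
        "total_nm": total_nm,
    }


# Toy records (hypothetical, not taken from the real BAM files)
TOY_SAM = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\tNM:i:1",
    "r2\t16\tchr1\t200\t0\t10M\t*\t0\t0\tACGTACGTAC\t*\tNM:i:3",
    "r2\t256\tchr1\t500\t0\t10M\t*\t0\t0\t*\t*\tNM:i:4",
    "r1\t2048\tchr1\t700\t60\t5M5H\t*\t0\t0\tACGTA\t*\tNM:i:0",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
]

summary = summarize_sam(TOY_SAM)
print(summary)
```

Running the same summary on both BAM files (by feeding the lines of `samtools view` output to the function) would give MAPQ distributions that can be compared directly. Note that a lower NM total is not automatically better: an aligner can lower it by soft-clipping difficult read ends instead of aligning them.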

13 hours ago

I believe ONT recommends minimap2. It is also the default tool in their EPI2ME workflow.

1 hour ago
tungsega ▴ 10

Just a very brief summary of the results using the latest versions of minimap2 and winnowmap (from samtools flagstat and samtools stats).

It seems that winnowmap produces fewer secondary and supplementary alignments, as well as fewer mismatches overall.

Any suggestions or thoughts on how to interpret this difference?

Minimap2 (minimap2 -ax lr:hq -t 96)

6877353 0       total (QC-passed reads + QC-failed reads)
4904554 0       primary
1109654 0       secondary
863145  0       supplementary
0       0       duplicates
0       0       primary duplicates
6876784 0       mapped
99.99%  N/A     mapped %
4903985 0       primary mapped
99.99%  N/A     primary mapped %
0       0       paired in sequencing
0       0       read1
0       0       read2
0       0       properly paired
N/A     N/A     properly paired %
0       0       with itself and mate mapped
0       0       singletons
N/A     N/A     singletons %
0       0       with mate mapped to a different chr
0       0       with mate mapped to a different chr (mapQ>=5)

# This file was produced by samtools stats (1.21+htslib-1.21) and can be plotted using plot-bamstats
# This file contains statistics for all reads.
# The command line was:  stats -F 0x900 -@ 64 preprocessed_correct_minimap2.bam
# CHK, Checksum [2]Read Names   [3]Sequences    [4]Qualities
# CHK, CRC32 of reads which passed filtering followed by addition (32bit overflow)
CHK     85705490        7aed57ce        ed2579dd
# Summary Numbers. Use `grep ^SN | cut -f 2-` to extract this part.
SN      raw total sequences:    6877353 # excluding supplementary and secondary reads
SN      filtered sequences:     1972799
SN      sequences:      4904554
SN      is sorted:      1
SN      1st fragments:  4904554
SN      last fragments: 0
SN      reads mapped:   4903985
SN      reads mapped and paired:        0       # paired-end technology bit set + both mates mapped
SN      reads unmapped: 569
SN      reads properly paired:  0       # proper-pair bit set
SN      reads paired:   0       # paired-end technology bit set
SN      reads duplicated:       0       # PCR or optical duplicate bit set
SN      reads MQ0:      37300   # mapped and MQ=0
SN      reads QC failed:        0
SN      non-primary alignments: 0
SN      supplementary alignments:       0
SN      total length:   181205124161    # ignores clipping
SN      total first fragment length:    181205124161    # ignores clipping
SN      total last fragment length:     0       # ignores clipping
SN      bases mapped:   181204951593    # ignores clipping
SN      bases mapped (cigar):   177052477463    # more accurate
SN      bases trimmed:  0
SN      bases duplicated:       0
SN      mismatches:     1820182430      # from NM fields
SN      error rate:     1.028047e-02    # mismatches / bases mapped (cigar)
SN      average length: 36946
SN      average first fragment length:  36946
SN      average last fragment length:   0
SN      maximum length: 417259
SN      maximum first fragment length:  417259
SN      maximum last fragment length:   0
SN      average quality:        255.0
SN      insert size average:    0.0
SN      insert size standard deviation: 0.0
SN      inward oriented pairs:  0
SN      outward oriented pairs: 0
SN      pairs with other orientation:   0
SN      pairs on different chromosomes: 0
SN      percentage of properly paired reads (%):        0.0

Winnowmap2 (winnowmap -W repetitive_k15.txt -ax map-ont -t 96)

6116874 0       total (QC-passed reads + QC-failed reads)
4904554 0       primary
583606  0       secondary
628714  0       supplementary
0       0       duplicates
0       0       primary duplicates
6116058 0       mapped
99.99%  N/A     mapped %
4903738 0       primary mapped
99.98%  N/A     primary mapped %
0       0       paired in sequencing
0       0       read1
0       0       read2
0       0       properly paired
N/A     N/A     properly paired %
0       0       with itself and mate mapped
0       0       singletons
N/A     N/A     singletons %
0       0       with mate mapped to a different chr
0       0       with mate mapped to a different chr (mapQ>=5)

# This file was produced by samtools stats (1.21+htslib-1.21) and can be plotted using plot-bamstats
# This file contains statistics for all reads.
# The command line was:  stats -F 0x900 -@ 36 preprocessed_correct_winnowmap.bam
# CHK, Checksum [2]Read Names   [3]Sequences    [4]Qualities
# CHK, CRC32 of reads which passed filtering followed by addition (32bit overflow)
CHK     85705490        9c76eb9f        ed2579dd
# Summary Numbers. Use `grep ^SN | cut -f 2-` to extract this part.
SN      raw total sequences:    6116874 # excluding supplementary and secondary reads
SN      filtered sequences:     1212320
SN      sequences:      4904554
SN      is sorted:      1
SN      1st fragments:  4904554
SN      last fragments: 0
SN      reads mapped:   4903738
SN      reads mapped and paired:        0       # paired-end technology bit set + both mates mapped
SN      reads unmapped: 816
SN      reads properly paired:  0       # proper-pair bit set
SN      reads paired:   0       # paired-end technology bit set
SN      reads duplicated:       0       # PCR or optical duplicate bit set
SN      reads MQ0:      35758   # mapped and MQ=0
SN      reads QC failed:        0
SN      non-primary alignments: 0
SN      supplementary alignments:       0
SN      total length:   181205124161    # ignores clipping
SN      total first fragment length:    181205124161    # ignores clipping
SN      total last fragment length:     0       # ignores clipping
SN      bases mapped:   181205022091    # ignores clipping
SN      bases mapped (cigar):   177279689107    # more accurate
SN      bases trimmed:  0
SN      bases duplicated:       0
SN      mismatches:     1558339665      # from NM fields
SN      error rate:     8.790289e-03    # mismatches / bases mapped (cigar)
SN      average length: 36946
SN      average first fragment length:  36946
SN      average last fragment length:   0
SN      maximum length: 417259
SN      maximum first fragment length:  417259
SN      maximum last fragment length:   0
SN      average quality:        255.0
SN      insert size average:    0.0
SN      insert size standard deviation: 0.0
SN      inward oriented pairs:  0
SN      outward oriented pairs: 0
SN      pairs with other orientation:   0
SN      pairs on different chromosomes: 0
SN      percentage of properly paired reads (%):        0.0
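
The two summaries above can be compared programmatically. The sketch below is only an illustration: it parses `SN` lines with a hypothetical helper `parse_sn`, using short excerpts copied from the two reports above, and recomputes the error rate as mismatches / bases mapped (cigar), which is how the stats output itself defines it. Two caveats apply when reading the result. The "mismatches" value is summed from NM fields, and NM is an edit distance, so it includes inserted and deleted bases as well as substitutions. Also, the two runs used different presets (`lr:hq` for minimap2, `map-ont` for winnowmap), so the difference may reflect the preset as much as the aligner.

```python
import re

# Matches "SN <key>: <value> [# comment]"; works with tabs or spaces
SN_LINE = re.compile(r"^SN\s+(.*?):\s+(\S+)")


def parse_sn(text):
    """Parse the 'SN' summary lines of samtools stats output into a dict."""
    values = {}
    for line in text.splitlines():
        match = SN_LINE.match(line)
        if match:
            values[match.group(1)] = float(match.group(2))
    return values


# Excerpts copied from the two reports posted above
MINIMAP2_SN = """\
SN      reads unmapped: 569
SN      reads MQ0:      37300   # mapped and MQ=0
SN      bases mapped (cigar):   177052477463    # more accurate
SN      mismatches:     1820182430      # from NM fields
"""

WINNOWMAP_SN = """\
SN      reads unmapped: 816
SN      reads MQ0:      35758   # mapped and MQ=0
SN      bases mapped (cigar):   177279689107    # more accurate
SN      mismatches:     1558339665      # from NM fields
"""

reports = {
    "minimap2": parse_sn(MINIMAP2_SN),
    "winnowmap": parse_sn(WINNOWMAP_SN),
}

# Error rate as defined by samtools stats: mismatches / bases mapped (cigar)
rates = {
    name: sn["mismatches"] / sn["bases mapped (cigar)"]
    for name, sn in reports.items()
}

for name, sn in reports.items():
    print(
        f"{name}: error rate {rates[name]:.6e}, "
        f"MQ0 reads {int(sn['reads MQ0'])}, "
        f"unmapped reads {int(sn['reads unmapped'])}"
    )
```

The recomputed rates agree with the `error rate` lines in the reports, which confirms that the gap comes from the NM totals rather than from a difference in the number of aligned bases.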