Question

Is comparing seeds sufficient, or should alignments be compared instead?

5 months ago

curiousfellow@123 • 0

In seed-and-extend aligners, the initial seeding phase has a major influence on alignment quality and performance. I'm currently comparing two aligners (or two modes of the same aligner) that differ primarily in their seed generation strategy.

My question is about evaluation:

Is it meaningful to compare just the seeds — e.g., their counts, lengths, or positions — or is it better to compare the final alignments they produce?

I’m leaning toward comparing .sam outputs (e.g., MAPQ, AS, NM, primary/secondary flags, unmapped reads), since not all seeds contribute equally to final alignments. But I’d love to hear from the community:

What are best practices for evaluating seeding strategies?
Is seed-level analysis ever sufficient or meaningful on its own?
What alignment-level metrics are most helpful when comparing the downstream impact of different seeds?

I’m interested in both empirical and theoretical perspectives. Thanks in advance!

readmapping • 562 views

updated 5 months ago by shenwei356 8.7k • written 5 months ago by curiousfellow@123 • 0

score 0 · Answer 1 · 2025-06-03

Based on my experience in developing an alignment tool, I believe both are important.

In the seeding phase, the priority is high sensitivity. So start with an initial sensitivity assessment: check whether the seeding or chaining results cover the true sequence and positions, using both mutated and mutation-free queries.
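As a rough illustration of that check, here is a minimal Python sketch. It assumes simulated reads whose true origin is known, and it assumes you can dump seeds from the aligner as (reference, query position, reference position, length) records. The names `Seed`, `Truth`, `seed_hits_truth`, `seeding_sensitivity` and the `max_shift` tolerance are all hypothetical and not taken from any particular tool. It only handles forward-strand seeds.

```python
from typing import Dict, List, NamedTuple


class Seed(NamedTuple):
    ref_name: str
    query_pos: int  # 0-based seed start in the read
    ref_pos: int    # 0-based seed start in the reference
    length: int


class Truth(NamedTuple):
    ref_name: str
    ref_start: int  # 0-based true start of the read on the reference


def seed_hits_truth(seed: Seed, truth: Truth, max_shift: int = 10) -> bool:
    """True if the seed lies on, or close to, the diagonal of the true placement.

    max_shift (an assumed tolerance) absorbs small indels between the
    start of the read and the seed.
    """
    if seed.ref_name != truth.ref_name:
        return False
    implied_start = seed.ref_pos - seed.query_pos
    return abs(implied_start - truth.ref_start) <= max_shift


def seeding_sensitivity(
    seeds_by_read: Dict[str, List[Seed]],
    truth_by_read: Dict[str, Truth],
    max_shift: int = 10,
) -> float:
    """Fraction of reads with at least one seed consistent with the truth."""
    if not truth_by_read:
        return 0.0
    recovered = sum(
        1
        for read_id, truth in truth_by_read.items()
        if any(
            seed_hits_truth(s, truth, max_shift)
            for s in seeds_by_read.get(read_id, [])
        )
    )
    return recovered / len(truth_by_read)


# Toy data: r1 has a seed on the true diagonal, r2 only has an off-target seed.
truth = {"r1": Truth("chr1", 100), "r2": Truth("chr1", 500)}
seeds = {
    "r1": [Seed("chr1", 5, 105, 15)],
    "r2": [Seed("chr1", 0, 900, 15)],
}
sensitivity = seeding_sensitivity(seeds, truth)
print(sensitivity)  # 0.5
```

Running this separately on mutated and mutation-free query sets shows how quickly each seeding strategy loses the true locus as divergence increases.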

Ideally, seeds/chains with high scores (more or longer seeds) would also improve specificity, which is especially helpful when returning only the top matches. But the highest-scoring chain does not always point to the best alignment, so evaluating the final alignments is also necessary.
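For the alignment-level side, one possible starting point is to summarize each SAM output with the metrics mentioned in the question (MAPQ, AS, NM, primary/secondary flags, unmapped reads) and compare the summaries between the two seeding modes. The sketch below uses only the Python standard library. The function name `summarize_sam` is hypothetical, and it assumes the aligner writes `AS:i` and `NM:i` tags, which not every aligner does.

```python
from statistics import mean
from typing import Dict, Iterable, Optional


def _mean_or_none(values) -> Optional[float]:
    """Mean of a list, or None when the list is empty."""
    return mean(values) if values else None


def summarize_sam(lines: Iterable[str]) -> Dict[str, object]:
    """Count record types and average MAPQ/AS/NM over primary alignments."""
    counts = {"primary": 0, "secondary": 0, "supplementary": 0, "unmapped": 0}
    mapq, as_scores, nm = [], [], []
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x4:        # read unmapped
            counts["unmapped"] += 1
            continue
        if flag & 0x100:      # secondary alignment
            counts["secondary"] += 1
            continue
        if flag & 0x800:      # supplementary alignment
            counts["supplementary"] += 1
            continue
        counts["primary"] += 1
        mapq.append(int(fields[4]))
        # Optional integer tags look like "AS:i:30"; keep them by two-letter key.
        tags = {f[:2]: int(f[5:]) for f in fields[11:] if f[2:5] == ":i:"}
        if "AS" in tags:
            as_scores.append(tags["AS"])
        if "NM" in tags:
            nm.append(tags["NM"])
    return {
        **counts,
        "mean_mapq": _mean_or_none(mapq),
        "mean_AS": _mean_or_none(as_scores),
        "mean_NM": _mean_or_none(nm),
    }


# Toy records: two primary hits, one unmapped read, one secondary hit.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tchr1\t101\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\tAS:i:10\tNM:i:0",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
    "r3\t256\tchr1\t201\t0\t10M\t*\t0\t0\t*\t*\tAS:i:6\tNM:i:2",
    "r4\t16\tchr1\t301\t20\t10M\t*\t0\t0\tACGTACGTAC\t*\tAS:i:8\tNM:i:1",
]
summary = summarize_sam(example)
print(summary)
```

With real data you would pass an open file handle for each mode's .sam file and compare the two dictionaries. Aggregate means can hide per-read differences, so for simulated reads it is probably also worth checking per read whether the primary alignment lands at the true position.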