Question

What Ever Happened To Alignment Servers?


11.0 years ago

Jeremy Leipzig 22k

BLAT had (and still has) a program called gfServer which would keep the index in memory. This BLAT server would run as a daemon (even on a separate server) and you would start a BLAT client to run your alignments.

Why can't an NGS aligner be kept running in this mode of operation? It seems like a no-brainer for big institutions, instead of firing up BWA every time.
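To make the pattern in the question concrete, here is a minimal in-process sketch of a "resident index": the index is built once and then answers many queries. It is not how gfServer or BWA work internally. The class name `ResidentIndex`, the k-mer size, and the exact-match lookup are all hypothetical, chosen only to illustrate the idea of paying the index cost once.

```python
from collections import defaultdict
from typing import List


class ResidentIndex:
    """Toy stand-in for a long-lived alignment server (hypothetical)."""

    builds = 0  # how many times an index has been built in this process

    def __init__(self, reference: str, k: int = 4):
        self.reference = reference
        self.k = k
        # Map every k-mer in the reference to the positions where it starts.
        self.index = defaultdict(list)
        for i in range(len(reference) - k + 1):
            self.index[reference[i:i + k]].append(i)
        ResidentIndex.builds += 1

    def align(self, read: str) -> List[int]:
        """Return reference positions where the read matches exactly."""
        if len(read) < self.k:
            return []
        seed = read[:self.k]
        # Look up the seed, then verify the full read at each candidate.
        return [p for p in self.index.get(seed, [])
                if self.reference.startswith(read, p)]


# The index is built once; every later query reuses it.
server = ResidentIndex("ACGTACGTTAGC")
hits = {read: server.align(read) for read in ("ACGT", "GTTA", "TTTT")}
```

A real server would put `align` behind a socket so that separate client processes could reach it, which is the part gfServer provides for BLAT.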

alignment blat • 2.3k views

Written 11.0 years ago by Jeremy Leipzig 22k • updated 3.8 years ago by Biostar 20


Doesn't STAR do that in part? It can at least leave the genome in memory after exiting so you only have to load it once (that really speeds things up). Sounds like a similar concept.

Devon Ryan 104k · Comment · 11.0 years ago
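The "leave the genome in memory after exiting" idea in this comment can be sketched with Python's `multiprocessing.shared_memory`. This is only an analogy, not STAR's implementation (STAR controls its own shared-memory behaviour through its `--genomeLoad` option). The function names and the tiny byte string standing in for a genome are hypothetical, and the sketch assumes a POSIX system, where a named segment persists until it is explicitly unlinked.

```python
from multiprocessing import shared_memory


def load_and_keep(genome: bytes) -> str:
    """Copy the data into a named shared-memory segment and detach."""
    shm = shared_memory.SharedMemory(create=True, size=len(genome))
    shm.buf[:len(genome)] = genome
    name = shm.name
    shm.close()  # detach this handle; the segment itself stays alive
    return name


def attach_and_read(name: str, length: int) -> bytes:
    """Attach to an existing segment by name and read without reloading."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:length])
    finally:
        shm.close()


def remove(name: str) -> None:
    """Free the segment once no further runs need it."""
    shm = shared_memory.SharedMemory(name=name)
    shm.close()
    shm.unlink()
```

A first run would call `load_and_keep`, later runs would call `attach_and_read` with the saved name, and a final cleanup step would call `remove`.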

Ram · Answer 1 · 2013-08-16

I think the data sizes of next-generation sequencing have flipped the expectations of the earlier BLAT world. Previously, the queries were small and the genomes (databases) were large. Now a bwa-indexed human genome is a few gigabytes, but a HiSeq lane is dozens of gigabytes. Loading the reference genome therefore isn't the bottleneck, so doing fancy things to keep it around between runs doesn't save much time. For the mammalian-sized genomes we work with, a typical bwa run loads the reference into memory in a matter of minutes, while reading and processing the reads takes hours. It helps that the expensive indexing step is performed only once, so the aligner can effectively load the raw index data structure straight into memory. OS-level optimizations, like reusing memory-mapped pages between processes, can help at the margins, but again there is little to gain from optimizing something that's only ~1% of the runtime.
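The "~1% of the runtime" argument is essentially Amdahl's law, and the arithmetic can be sketched in a few lines. The 1% fraction comes from the answer above; the 200-minute total runtime is a made-up figure used only for illustration, and the function names are hypothetical.

```python
def max_speedup(load_fraction: float) -> float:
    """Upper bound on overall speedup if index loading took zero time."""
    if not 0.0 <= load_fraction < 1.0:
        raise ValueError("load_fraction must be in [0, 1)")
    return 1.0 / (1.0 - load_fraction)


def minutes_saved(total_minutes: float, load_fraction: float) -> float:
    """Wall-clock time recovered per run by never reloading the index."""
    return total_minutes * load_fraction


# Loading is ~1% of the run (from the answer); 200 minutes is an assumed total.
speedup = max_speedup(0.01)
saved = minutes_saved(200.0, 0.01)
print(f"best-case speedup: {speedup:.4f}x, time saved: {saved:.1f} min")
```

Even with loading eliminated entirely, the run gets only about 1% faster, which is why a resident server buys little in this regime.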

Of course, this answer reflects the current state of things. Future situations where you'd align to dozens or hundreds of genomes simultaneously, or stream reads directly off the sequencer into a mapping server, may call for the kind of ideas you suggest.