I am looking for fast implementations of the Needleman–Wunsch algorithm. I used to work with needle (EMBOSS package), but I recently replaced it with ggsearch (FASTA package), which is nearly two orders of magnitude faster (maybe because it uses SSE2 instructions? I am not completely sure it does). Are there any new implementations (not listed in Wikipedia) that I should know about?
If your purpose is to search against a protein database, you can hardly find anything better than ggsearch. Ggsearch is based on SSE2 and was written by Michael Farrar who developed the original striped SSE2-SW algorithm. It is the only open-source global aligner so far as I know.
Another notable implementation is swat from phrap. It does global alignment as well. Swat is probably the fastest (or very close to the fastest) SW/NW aligner without SIMD, but it may be tens of times slower than ggsearch.
A problem with needle, if I am right, is that it always fills the trace-back matrix and is thus not suitable for searching against a protein database. I believe ggsearch and swat compute the score first and only fill the trace-back matrix when they think the alignment score is high enough.
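To make the "score first, trace back later" idea concrete, here is a minimal Python sketch. It is not how ggsearch or swat are implemented; the function names, the simple match/mismatch scoring, the linear gap penalty, and the `threshold` parameter are all assumptions chosen for illustration. The score-only pass keeps a single row of the matrix, and the full matrix is built only for sequences that pass the threshold.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score only, keeping one DP row (O(len(b)) memory)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[-1]


def nw_align(a, b, match=1, mismatch=-1, gap=-2):
    """Full DP matrix plus trace-back; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        H[i][0] = i * gap
    for j in range(1, m + 1):
        H[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)

    # Walk back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return H[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def search(query, library, threshold):
    """Score every library sequence cheaply; align only the promising ones."""
    hits = []
    for name, seq in library:
        if nw_score(query, seq) >= threshold:  # cheap pass, no trace-back matrix
            hits.append((name,) + nw_align(query, seq))  # expensive pass
    return hits
```

For example, `search("GATTACA", [("same", "GATTACA"), ("other", "CCCCCCC")], threshold=5)` only runs the trace-back for `"same"`. In a database search most sequences fail the threshold, so most of the work stays in the cheap score-only pass.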
ggsearch36 does use SSE2 instructions, but it also achieves a speed-up by skipping library sequences that are more than 25% shorter or 33% longer than the query, so they are never aligned at all.
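As a rough illustration of that length filter (not ggsearch36's actual code), the sketch below drops library sequences whose length falls outside the window before any alignment is attempted. The function names and the exact cut-offs are assumptions taken from the percentages quoted above; integer arithmetic is used to avoid floating-point edge cases at the boundaries.

```python
def within_length_window(query_len, subject_len, shorter_pct=25, longer_pct=33):
    """True if subject_len is at most shorter_pct% shorter or longer_pct% longer
    than query_len (hypothetical cut-offs, compared in integer arithmetic)."""
    lower = query_len * (100 - shorter_pct)
    upper = query_len * (100 + longer_pct)
    return lower <= subject_len * 100 <= upper


def prefilter(query, library):
    """Keep only (name, sequence) pairs worth passing to a global aligner."""
    return [(name, seq) for name, seq in library
            if within_length_window(len(query), len(seq))]
```

With a 100-residue query, this keeps library sequences of 75 to 133 residues and discards everything else without computing a single matrix cell. For a global alignment that is a reasonable trade-off, since a large length difference forces many gaps and usually a poor score anyway.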
It looks very handy! I'll definitely give it a try.