Question

Cuda/Gpu Processing And Bioinformatics

score 10

13.5 years ago

Matt ▴ 110

Does anybody see any progress on the horizon for leveraging GPU processing against computationally challenging problems, e.g. de novo assembly? I know that there are a few (older) threads over at SEQanswers, and MUMmerGPU is one option for alignment.

assembly hardware • 9.2k views

updated 13.3 years ago by Alastair Kerr 5.3k • written 13.5 years ago by Matt ▴ 110

score 16 · Answer 1 · 2010-11-19

Most of the difficult problems in biology are readily reduced to graph or set problems. They're hard not because the math is difficult but because they involve large datasets with lots of inter-dependencies. Your example of de novo assembly is a good one. Doing assembly on 300 million reads doesn't require any difficult computing, just an awful lot of it. In the worst case it has to compare every read with every other read, which would take years at least. Everything I've seen GPU processing used on involves floating-point calculations. Things like protein folding or molecular orbital energy calculations require a lot of complex math, which is why they get such a boost from GPUs, which are designed to do that kind of math more quickly than CPUs.
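To put a rough number on the worst case described above, here is a minimal sketch that counts the unordered read pairs in an all-against-all comparison and converts that into wall-clock time. The function names and the throughput of one billion comparisons per second are assumptions made only for illustration; they are not measurements from any assembler.

```python
def all_pairs(n_reads):
    """Number of unordered pairs when every read is compared with every other read."""
    return n_reads * (n_reads - 1) // 2


def years_needed(n_reads, comparisons_per_second):
    """Wall-clock years for an all-against-all comparison at a fixed throughput.

    comparisons_per_second is a hypothetical figure supplied by the caller.
    """
    seconds = all_pairs(n_reads) / comparisons_per_second
    return seconds / (365 * 24 * 3600)


n = 300_000_000  # the read count used in the answer
print(all_pairs(n))          # about 4.5e16 pairs
print(years_needed(n, 1e9))  # assumed rate: 1e9 read-vs-read comparisons per second
```

Under that assumed rate the comparison alone takes more than a year, and a realistic read-versus-read comparison is far more expensive than a single operation, which is why real assemblers avoid the quadratic approach in the first place.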

Almost everything I've done in bioinformatics has been memory-limited, and GPU processing won't help with that. For the many problems in bioinformatics where memory is the limiting factor, GPU processing won't be worth the effort or the money. I don't see that changing any time in the next few years.

score 3 · Answer 2 · 2010-11-20

It would be good to study GPU-based algorithms as a research project, but I do not see them being of much practical use in sequence analysis in the near future. Here are the reasons:

1. GPUs are costly. To use GPU-based algorithms, you have to put one or two graphics cards in each node of a compute farm, yet only a few GPU-powered programs can benefit from that. If you spend the same money on more CPUs instead, all programs will benefit.
2. GPUs are not fast enough, even against the same algorithm implemented in a different way. GPU-powered programs are fast, but they are only a few times faster than the best alternative. For example, with one GeForce 8800, GPU-SW (Manavski and Valle, 2007) is about the same speed as SSE2-SW (Farrar, 2006). MUMmerGPU is only 3X faster than MUMmer. I would guess that accelerating MUMmer with SSE2 (if at all possible) may deliver similar performance.
3. Improvements to algorithms are frequently more effective. Take Smith-Waterman alignment (SW) as the example again. The initial version of BWT-SW does not use a GPU or SSE2, but it is hundreds of times faster than GPU-SW (and thousands of times faster than the standard implementation). HMMER3 is another example. Although the use of SSE2 is one of the major boosts, improvements to the underlying algorithms also play an important role. Also, as you said, MUMmerGPU is an option, but there are much better programs for NGS applications.
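For readers unfamiliar with the Smith-Waterman alignment discussed in the reasons above, the following is a minimal sketch of the textbook dynamic-programming recurrence with a linear gap penalty. It is not the code of GPU-SW, SSE2-SW, or BWT-SW; the function name and the scoring values (match 2, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty).

    Fills the (len(a)+1) x (len(b)+1) score table row by row, keeping only
    the previous row in memory. The scoring parameters are illustrative.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("ACACACTA", "AGCACACA"))  # prints 12
```

Every cell of the table is visited, so the work grows with the product of the two sequence lengths. Roughly speaking, GPU and SSE2 implementations fill this table faster, while an algorithmic improvement aims to avoid filling most of it, which is one way to read the point that better algorithms can outpace faster hardware.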

Some people are optimistic about GPU-based algorithms because they think a GPU has many more "cores" than a CPU. In practice, however, we can hardly reach the theoretical performance because of I/O and the constraints of the algorithms. Studying GPU-based algorithms is more of research interest than of practical use. I see SSE2 as much more promising.
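One common way to reason about the gap between core count and real speedup is Amdahl's law. The answer does not name it; it is brought in here only as an illustration. The sketch below assumes a program in which some fraction of the runtime can be accelerated and the rest (for example I/O) cannot. The function name and the numbers are hypothetical.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of the original runtime that runs on the GPU (0..1)
    kernel_speedup: how much faster that share becomes (must be > 0)
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    untouched = 1.0 - accelerated_fraction  # e.g. I/O that stays on the CPU
    return 1.0 / (untouched + accelerated_fraction / kernel_speedup)


# Hypothetical case: 75% of the runtime is accelerated 100-fold
print(overall_speedup(0.75, 100.0))
```

With these assumed numbers the whole program gets less than 4 times faster, however many cores the accelerated part uses, because the remaining quarter of the runtime is untouched.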

EDIT: I am not qualified to predict the long-term trend of GPU computing. The trend will depend on the evolution of GPUs and many-core CPUs, which I know little about. I can only predict that for the next year or two, GPUs will be of little practical use in sequence analysis.

EDIT 20101203: This link could be interesting to someone, although the use of DFS worries me a little: using DFS is faster but less accurate.

EDIT 20110110: GPU BLAST has been published, with a reported 3-4 fold speedup. I believe an SSE2-powered BLAST would be faster. Still no sign of GPU computing gaining ground in sequence analysis.

score 2 · Answer 3 · 2010-11-20

13.5 years ago

Darked89 4.6k

If you trust benchmarks published by the vendor: http://www.nvidia.com/object/bio_info_life_sciences.html

GPU HMMER looks impressive, but they compare it to HMMER 2.0. HMMER 3.0 is way faster: http://hmmer.janelia.org/

score 1 · Answer 4 · 2012-05-16

12.0 years ago

Biostar User ▴ 360

Another project related to GPU and MSA: http://gpualign.cs.put.poznan.pl/

score 0 · Answer 5 · 2012-05-17

12.0 years ago

Alastair Kerr 5.3k

Also take a look at biomanycores, a good resource for libraries for the Bio* frameworks.
