Question: CUDA/GPU Processing and Bioinformatics
Matt (Berkeley, CA) wrote, 8.7 years ago:

Does anybody see progress on the horizon for leveraging GPU processing against computationally challenging problems, e.g. de novo assembly? I know there are a few (older) threads over at SEQanswers, and MUMmerGPU is one option for alignment.

Tags: assembly, hardware • 7.0k views

Mrawlins (Retirement) wrote, 8.7 years ago:

Most of the difficult problems in biology reduce readily to graph or set problems. They're hard not because the math is difficult, but because they run on large datasets with lots of inter-dependencies. Your example of de novo assembly is a good one: assembling 300 million reads doesn't require any difficult computing, just an awful lot of it. In the worst case it has to compare every read with every other read, which would take years at least. Everything I've seen GPU processing used on is heavy floating-point calculation. Things like protein folding or molecular orbital energies require a lot of numerical math, which is why they get such a boost from GPUs, which are designed to do that kind of arithmetic much faster than CPUs.
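
To put a rough number on that worst case, here is a back-of-envelope sketch. The read count is the 300 million quoted above; the assumed rate of one billion pair checks per second is purely illustrative.

    # Back-of-envelope cost of naive all-vs-all read comparison.
    # n_reads matches the figure quoted above; the checks-per-second
    # rate is an illustrative assumption, not a measurement.
    n_reads = 300_000_000
    pairs = n_reads * (n_reads - 1) // 2   # ~4.5e16 candidate pairs
    rate = 1_000_000_000                   # assumed pair checks per second
    seconds = pairs / rate
    print(f"{pairs:.2e} pairs -> {seconds / 86400:,.0f} days "
          f"(~{seconds / (86400 * 365):.1f} years)")

Even at that optimistic rate the naive approach needs well over a year, and a real overlap computation does far more than one operation per pair, which is one reason practical assemblers try to avoid naive all-vs-all comparison.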

Almost everything I've done in bioinformatics has been memory-limited, and GPU processing won't help with that. For the many problems in bioinformatics where memory is the limiting factor, GPUs won't be worth the effort or the money, and I don't see that changing in the next few years.

Chris Miller replied, 8.7 years ago:

I'll add that there are lots of processes (like short-read mapping) that are disk I/O limited, especially on multi-core machines. GPUs don't help with that either.

Andrea_Bio replied, 8.7 years ago:

I totally agree. It is the quantity of data in 'routine' cases that is the problem, not the nature of the computation.

lh3 (United States) wrote, 8.7 years ago:

It would be good to study GPU-based algorithms as a research project, but I do not see them being of much practical use for sequence analysis in the near future. Here are the reasons:

  1. GPUs are costly. To use GPU-based algorithms you have to put one or two graphics cards in each node of a compute farm, yet only the few GPU-powered programs can benefit from them. If you spend the same money on more CPUs, every program benefits.

  2. GPUs are not that much faster even for the same algorithm implemented in different ways. GPU-powered programs are fast, but only a few times faster than the best CPU alternative. For example, with one GeForce 8800, GPU-SW (Manavski and Valle, 2007) is about the same speed as SSE2-SW (Farrar, 2006). MUMmerGPU is only 3X faster than MUMmer, and I would guess that accelerating MUMmer with SSE2 (if that is possible at all) might deliver similar performance.

  3. Improvements to algorithms are frequently more effective. Take Smith-Waterman alignment (SW) as the example again; a plain scalar baseline is sketched after this list. The initial version of BWT-SW uses neither GPU nor SSE2, yet it is hundreds of times faster than GPU-SW (and thousands of times faster than the standard implementation). HMMER3 is another example: although SSE2 is one of the major boosts, improvements to the underlying algorithm also play an important role. Also, as you said, MUMmerGPU is an option, but there are much better programs for NGS applications.
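
For context, below is the plain scalar Smith-Waterman recurrence that GPU-SW, SSE2-SW and BWT-SW all accelerate, each in its own way. It is a minimal sketch assuming simple linear gap penalties; the scoring parameters are illustrative, not those used by any of the tools above.

    def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
        """Plain scalar Smith-Waterman: best local alignment score, O(len(a)*len(b))."""
        prev = [0] * (len(b) + 1)               # DP row for the previous character of a
        best = 0
        for i in range(1, len(a) + 1):
            curr = [0] * (len(b) + 1)
            for j in range(1, len(b) + 1):
                s = match if a[i - 1] == b[j - 1] else mismatch
                curr[j] = max(0,                  # local alignment never goes negative
                              prev[j - 1] + s,    # match / mismatch
                              prev[j] + gap,      # gap in b
                              curr[j - 1] + gap)  # gap in a
                best = max(best, curr[j])
            prev = curr
        return best

    print(smith_waterman("ACACACTA", "AGCACACA"))   # best local alignment score

The quadratic inner loop is exactly what SSE2 striping and GPU threads parallelize; roughly speaking, BWT-SW avoids filling most of that matrix in the first place, which is the point of item 3.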

Some people are optimistic about GPU-based algorithms because a GPU has many more "cores" than a CPU, but in practice we can rarely reach the theoretical performance, due to I/O and the constraints of the algorithms. Studying GPU-based algorithms is more of a research interest than of practical use; I see SSE2 as much more promising.
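
One way to make that gap concrete is a roofline-style estimate: a kernel that performs only a few operations per byte it reads is bounded by memory bandwidth, not by the number of cores. The peak, bandwidth and intensity figures in the sketch below are illustrative assumptions, not measurements of any particular GPU.

    # Roofline-style estimate: attainable throughput is bounded by
    # min(peak compute, memory bandwidth * arithmetic intensity).
    # All three numbers below are illustrative assumptions.
    peak_gflops = 500.0     # assumed theoretical peak, GFLOP/s
    bandwidth_gbs = 100.0   # assumed device memory bandwidth, GB/s
    ops_per_byte = 0.5      # assumed intensity of a string-comparison kernel

    attainable = min(peak_gflops, bandwidth_gbs * ops_per_byte)
    print(f"attainable ~{attainable:.0f} GFLOP/s, "
          f"{attainable / peak_gflops:.0%} of the theoretical peak")

With these assumptions the kernel reaches only a tenth of the advertised peak, which matches the observation that extra "cores" do not automatically translate into real speed.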

EDIT: I am not qualified to predict the long-term trend of GPU computing. The trend will depend on the evolution of GPUs and many-core CPUs, about which I have little idea. I can only predict that in a year or two, GPUs will be of little practical use in sequence analysis.

EDIT 20101203: This link could be interesting to someone, although the use of DFS worries me a little: DFS is faster but less accurate.

EDIT 20110110: GPU-BLAST has been published, reportedly with a 3-4 fold speedup. I believe an SSE2-powered BLAST would be faster. Still no sign of GPU computing gaining ground in sequence analysis.

Michael Schubert replied, 8.7 years ago:

I'm sorry, but I totally disagree. GPUs are one of the cheapest options if you look at the price per GFLOP. It all depends on how parallelizable your algorithm is in theory and how well that has been exploited in a given application. GPU computing can be tremendously powerful, but for most biological applications it isn't just yet (for that, GPUs will need more memory and perhaps some changes in architecture).
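
How much parallelizability matters follows from Amdahl's law: if a fraction p of the run time is parallelized and that part runs s times faster, the overall speedup is 1 / ((1 - p) + p / s). A small sketch with illustrative values of p and s:

    # Amdahl's law: overall speedup when a fraction p of the work is
    # accelerated by a factor s. The values of p and s are illustrative.
    def amdahl(p, s):
        return 1.0 / ((1.0 - p) + p / s)

    print(amdahl(0.90, 100))   # ~9.2x, even with a 100x faster parallel part
    print(amdahl(0.99, 100))   # ~50x, only when 99% of the work parallelizes

Unless nearly all of the run time sits in the accelerated kernel, the card's price per GFLOP matters much less than the serial remainder.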

lh3 replied, 8.7 years ago:

Perhaps you are right on a 5-to-10-year horizon, but will this happen within a couple of years? I do not see it. Also, 64-core CPUs have already been developed in the lab; I forget whether the architecture is symmetric or has other practical problems, but CPUs may also improve vastly over 5 to 10 years. I know GPUs are much cheaper in terms of price per GFLOP, but all that matters is whether we can get that speed in practice. When the first few GPU-powered algorithms were published in 2007, I told my friends they would not be popular within a couple of years. I was right that time, though I may be wrong this time.


Michael Schubert replied, 8.7 years ago:

Another thing: you are using old speed comparisons. Look at how the raw performance of CPUs vs. GPUs has changed over the last two years. That gap will continue to grow, eventually making GPU computing feasible for many applications it isn't quite ready for yet.

Darked89 (Barcelona, Spain) wrote, 8.7 years ago:

If you trust benchmarks published by the vendor: http://www.nvidia.com/object/bio_info_life_sciences.html

GPU HMMER looks impressive, but they compare it to HMMER 2.0. HMMER 3.0 is way faster: http://hmmer.janelia.org/

Biostar User wrote, 7.3 years ago:

Another project related to GPU computing and multiple sequence alignment (MSA): http://gpualign.cs.put.poznan.pl/

Alastair Kerr (The University of Edinburgh, UK) wrote, 7.3 years ago:

Also have a look at biomanycores, a good resource for libraries for the Bio* frameworks.
