Blastn on GPU?
6
5
11.8 years ago

Does anybody have experience running blastn (nucleotide sequence search) on a GPU? The freely and easily obtainable tools only provide blastp, which is for proteins, and I need a nucleotide search. Since the tool must at least search both the direct and the reverse-complement sequence, just feeding nucleotide sequences to blastp may not be the best choice.
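To illustrate the strand requirement, here is a minimal Python sketch of what searching both the direct and the reverse-complement sequence means. It only does exact string matching, so it is not a substitute for blastn, and the function names are made up for this example.

```python
# Complement table for the four bases plus N (unknown); case is preserved.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_both_strands(query, subject):
    """Exact-match positions of query in subject on either strand.

    Returns (0-based start in subject, strand) pairs; '-' means the
    reverse complement of the query matched at that position.
    """
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        start = subject.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = subject.find(pattern, start + 1)
    return hits
```

A protein search tool has no notion of the '-' strand, which is why the second pattern would be missed if nucleotide data were simply passed to blastp.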

blast blastn • 8.6k views
3
11.8 years ago
m.s.easton ▴ 30

Hi

Apologies if this is not relevant, as I am still very new to this field and trying to get my head around it, but have you seen this paper, which describes a GPU-optimised BLASTX alternative:

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3344842/

Cheers, Mark

0

It builds, but it does not seem to run for me. I tried:

ghostm db -i Rice.fasta -o Rice

ghostm qry -i q.fasta -t d -o query

ghostm aln -i query -d Rice -o /dev/stdout -v

(Rice.fasta and q.fasta are the nucleotide database and query in FASTA format, and q.fasta has direct matches in Rice.fasta, verified with grep.) No output is ever produced. Moreover, if I also specify -D 0 (the GPU device ID), it always fails with "error:Out of GPU memory" (the card has 1 GB, so it should not run out that quickly). It also fails with exactly the same message if I specify the ID of a non-existent GPU device.

It would be great if the authors or maintainers could comment.

3
10.6 years ago
OpenHero ▴ 30

You can try GPU blastn. http://www.comp.hkbu.edu.hk/~chxw/software/G-BLASTN.html

2
11.8 years ago
Joachim ★ 2.9k

I have not used these particular tools myself, but perhaps they are what you are looking for:

Open Access

Svetlin A. Manavski, Giorgio Valle, "CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment", BMC Bioinformatics 2008, 9(2)

Yongchao Liu, Douglas L Maskell and Bertil Schmidt, "CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units", BMC Research Notes 2009, 2:73

Closed Access

Yang Liu, Wayne Huang, John Johnson and Sheila Vaidya, "GPU Accelerated Smith-Waterman", Lecture Notes in Computer Science, 2006, Volume 3994

0

It builds from source and works with nucleotide sequences, and it is relatively easy to start using (it takes FASTA sequences without indexing). However, even the new 2.0.8 version seems to choke on sequences longer than roughly 64 Kb in the input database. The total length of the input database also appears to be limited.

I have implemented a loop over the sequence database so as not to feed in all 10 Gb at once, and added a filter to drop all sequences over 64 Kb (this is just an evaluation anyway). With these alterations, on the mid-range 560 Ti GPU it runs more or less like a single i7 3960 thread, but that CPU has six cores. To be precise, a single CPU thread needs 1 min 54 s to search for sixteen 25 bp sequences in a 1.8 Gbp database; the GPU needs 1 min 23 s for the same search. The time is more or less the same if I place the database on a RAM drive, so this is not just a limitation of the hard drive.
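The batching loop and length filter described above could look roughly like the following Python sketch. It is not the code actually used for the evaluation: the function names, the batch size, and reading "64 Kb" as 64 * 1024 bases are all assumptions made for illustration.

```python
MAX_SEQ_LEN = 64 * 1024            # assumed reading of the ~64 Kb per-sequence limit
MAX_BATCH_BASES = 256 * 1024 ** 2  # hypothetical batch size; tune for the GPU


def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batches(records, max_seq_len=MAX_SEQ_LEN, max_batch_bases=MAX_BATCH_BASES):
    """Group records into batches, skipping sequences longer than max_seq_len."""
    batch, total = [], 0
    for header, seq in records:
        if len(seq) > max_seq_len:
            continue  # drop over-long sequences (acceptable for an evaluation only)
        if batch and total + len(seq) > max_batch_bases:
            yield batch
            batch, total = [], 0
        batch.append((header, seq))
        total += len(seq)
    if batch:
        yield batch
```

Each batch would then be written to a temporary FASTA file and passed to the GPU tool in turn, for example with `for batch in batches(read_fasta(open("Rice.fasta"))): ...`.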

A high-end card (the GTX 690 has about six times as many cores) would probably perform about the same as a whole i7 3960 CPU, which is also high-end. The only obvious benefit seems to be that I could add one or even two GPU cards to my desktop, while I cannot add another CPU.
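The comparison can be put into numbers with a back-of-envelope calculation. The sketch below uses the timings reported above and assumes perfectly linear scaling with core count on both the CPU and the GPU, which real hardware rarely achieves; the function name is invented for this example.

```python
def scaling_estimate(cpu_thread_s=114.0, gpu_s=83.0, cpu_cores=6, gpu_core_ratio=6.0):
    """Rough estimate from the measured timings, assuming linear scaling.

    cpu_thread_s   -- one CPU thread: 1 min 54 s
    gpu_s          -- mid-range GPU: 1 min 23 s
    cpu_cores      -- cores available on the CPU
    gpu_core_ratio -- how many times more cores the larger GPU has
    """
    return {
        "gpu_vs_one_thread": cpu_thread_s / gpu_s,   # speed-up over one thread
        "cpu_all_cores_s": cpu_thread_s / cpu_cores,  # ideal time on all cores
        "larger_gpu_s": gpu_s / gpu_core_ratio,       # ideal time on larger GPU
    }
```

Under these assumptions the mid-range GPU is about 1.37 times faster than one CPU thread, the six-core CPU would ideally need about 19 s, and a GPU with six times the cores about 14 s, so the two high-end parts land in the same range.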

1
11.8 years ago

This paper describes GPU-Blast, which unfortunately is designed for aligning protein sequences and not nucleotides.

Panagiotis D. Vouzis and Nikolaos V. Sahinidis, "GPU-BLAST: using graphics processors to accelerate protein sequence alignment," Vol. 27, no. 2, pages 182-188, Bioinformatics, 2011 (Open Access). Available at: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3018811/

The software is available at http://eudoxus.cheme.cmu.edu/gpublast/gpublast.html

Maybe it can be adapted to align nucleotide sequences as well; it should not be much different. You could contact the authors to ask whether it can be used for nucleotides.

0
10.6 years ago
OpenHero ▴ 30

There are four main steps in blastn:

1. Prepare the hash table with the mask data.
2. Scan the database for hits. The -num_threads option is only useful in this step.
3. Trace back the results in the database.
4. Print the results.
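As a rough illustration of steps 1 and 2, here is a toy Python sketch of a word lookup table and a database scan. It is not the NCBI or G-BLASTN implementation: it ignores masking, uses exact word matches only, leaves out extension and trace-back, and the function names and the word size are choices made for this example.

```python
from collections import defaultdict


def build_lookup(query, word_size):
    """Step 1 (simplified): map every word of the query to its offsets."""
    table = defaultdict(list)
    for i in range(len(query) - word_size + 1):
        table[query[i:i + word_size]].append(i)
    return table


def scan_subject(table, subject, word_size):
    """Step 2 (simplified): report (query_offset, subject_offset) seed hits."""
    hits = []
    for j in range(len(subject) - word_size + 1):
        # .get() avoids inserting empty entries into the defaultdict
        for i in table.get(subject[j:j + word_size], ()):
            hits.append((i, j))
    return hits
```

The scan loop touches every position of every database sequence independently, which is why this step is the one that benefits from multiple threads or a GPU.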

Using the -num_threads option (multi-threading in step 2) is better than running multiple processes, because each process loads the database and the mask database into RAM separately.

Our G-BLASTN speeds up the scan step on the GPU and the trace-back step with SSE, and changes the framework into a pipeline so that the steps can overlap.

You can find the source code and release 1.0 at http://www.comp.hkbu.edu.hk/~chxw/so.../G-BLASTN.html, https://sourceforge.net/projects/gblastn/ and https://github.com/OpenHero/gblastn

0
10.6 years ago

I thought that BLAST+ (from NCBI) was designed to be comparable to running threads on a GPU server:

http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download

I think this thread also agrees?

http://seqanswers.com/forums/showthread.php?t=26085

0

Legacy NCBI BLAST and NCBI BLAST+ do not support GPUs. The threading options ('-a' for legacy NCBI BLAST and '-num_threads' for NCBI BLAST+) refer to CPU-based threads, and must be specified for multi-threaded searches to be performed.
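For reference, a multi-threaded CPU search with BLAST+ could be launched from Python roughly as follows. This is a sketch: the helper names and the output file name are hypothetical, the query and database names are borrowed from the example earlier in the thread, and it assumes blastn is on the PATH and the database was already built with makeblastdb.

```python
import subprocess


def blastn_command(query, db, out, num_threads=6):
    """Build an argv list for NCBI BLAST+ blastn using CPU threads."""
    return [
        "blastn",
        "-query", query,
        "-db", db,
        "-out", out,
        "-outfmt", "6",                    # tabular output
        "-num_threads", str(num_threads),  # CPU threads, not GPU
    ]


def run_blastn(query, db, out, num_threads=6):
    """Run blastn and raise CalledProcessError if it exits with an error."""
    subprocess.run(blastn_command(query, db, out, num_threads), check=True)
```

For example, `run_blastn("q.fasta", "Rice", "hits.tsv")` would search with six threads, after the database has been created with something like `makeblastdb -in Rice.fasta -dbtype nucl -out Rice`.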

0

Ok - thanks for the feedback
