Running BLAST on huge files
9.3 years ago
User000 ▴ 690

Hello,

I have quite a simple question.

I have a huge file of 7 GB (approx. 43 million sequences), and I want to BLAST a single 1000 bp sequence against it. Is it possible to do this on a normal computer with 300 GB of free disk space? I get only 97 hits, but I suspect that BLAST may have been interrupted because of a space or power limit. I also tried to split the file into smaller files, but my computer simply gets stuck. Any advice or comment is appreciated.

alignment blast • 3.8k views

Recently I have been using LAST for similar tasks. It is really fast and sensitive enough: you build a database from your single sequence and align the 43M sequences against it. LAST produces output in a different format from BLAST, but it provides scripts for format conversion.
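A minimal sketch of that workflow, written as a small Python helper that only builds the command lines (it does not run them). The file names `query.fa`, `huge.fa`, `single_seq_db` and `hits.maf` are placeholders I made up, and I am assuming the standard LAST programs `lastdb`, `lastal` and `maf-convert` are installed; check the LAST documentation for the options that suit your data.

```python
import shlex


def last_commands(single_seq_fasta, big_fasta,
                  db_name="single_seq_db", maf_out="hits.maf"):
    """Return the LAST command lines (as argument lists) for the workflow.

    All file and database names are hypothetical placeholders.
    """
    # 1. Build a LAST database from the single 1000 bp sequence.
    build_db = ["lastdb", db_name, single_seq_fasta]
    # 2. Align the big file against it; lastal writes MAF to stdout,
    #    so redirect it to maf_out when running the command.
    align = ["lastal", db_name, big_fasta]
    # 3. Convert the MAF output to a BLAST-like tabular format.
    convert = ["maf-convert", "blasttab", maf_out]
    return build_db, align, convert


if __name__ == "__main__":
    for cmd in last_commands("query.fa", "huge.fa"):
        print(shlex.join(cmd))
```

Because the database is built from the one short sequence, the index stays tiny and the 43M sequences are simply streamed through as queries.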


Hi User000,

Has your question been answered? We would appreciate your feedback! Please mark answer(s) you found useful if you're satisfied with them.

Thank you!

9.3 years ago

Maybe consider bwa mem:

Align 70bp-1Mbp query sequences with the BWA-MEM algorithm. Briefly, the algorithm works by seeding alignments with maximal exact matches (MEMs) and then extending seeds with the affine-gap Smith-Waterman algorithm (SW).

...

The BWA-MEM algorithm performs local alignment. It may produce multiple primary alignments for different part of a query sequence. This is a crucial feature for long sequences. However, some tools such as Picard's markDuplicates does not work with split alignments. One may consider to use option -M to flag shorter split hits as secondary.

Depending on your sequences, you might want to index the 1000bp and align the 7GB file, or the other way round (is bwa going to choke with 43M reference sequences? Not sure...)

Dario
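For illustration, the two orientations mentioned above could be set up roughly like this. It is only a sketch that builds the `bwa index` and `bwa mem` command lines as Python lists; the file names are placeholders, and whether bwa copes well with 43M reference sequences remains untested here.

```python
def bwa_commands(reference_fasta, query_fasta, mark_split_as_secondary=True):
    """Return (index_cmd, mem_cmd) argument lists for bwa.

    File names are hypothetical; bwa mem writes SAM to stdout,
    so redirect its output to a file when running it.
    """
    index_cmd = ["bwa", "index", reference_fasta]
    mem_cmd = ["bwa", "mem"]
    if mark_split_as_secondary:
        # -M flags shorter split hits as secondary (see the quote above).
        mem_cmd.append("-M")
    mem_cmd += [reference_fasta, query_fasta]
    return index_cmd, mem_cmd


# Orientation A: index the single 1000 bp sequence, align the 7 GB file to it.
small_ref = bwa_commands("query_1000bp.fa", "huge.fa")

# Orientation B: index the 7 GB file, align the single sequence to it.
big_ref = bwa_commands("huge.fa", "query_1000bp.fa")

if __name__ == "__main__":
    for index_cmd, mem_cmd in (small_ref, big_ref):
        print(" ".join(index_cmd))
        print(" ".join(mem_cmd))
```

Orientation A keeps the index trivial to build; orientation B pays the indexing cost once and is then reusable for other queries.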

9.3 years ago
5heikki 11k

Sounds to me like your BLAST run worked just fine. Good practice, though, would be to create a BLAST database from your huge file (it's not actually that big) and BLAST against that.
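Something along these lines, as a sketch: it assumes nucleotide sequences and the BLAST+ programs `makeblastdb` and `blastn`, and the file and database names (`huge.fa`, `huge_db`, `query.fa`, `hits.tsv`) are placeholders. The helper only builds the command lines; nothing is executed.

```python
def blast_commands(big_fasta, query_fasta, db_name="huge_db",
                   out_file="hits.tsv", threads=4):
    """Return (makeblastdb_cmd, blastn_cmd) argument lists.

    Assumes nucleotide data; all names are hypothetical placeholders.
    """
    # Build a nucleotide BLAST database from the big file (done once).
    make_db = [
        "makeblastdb",
        "-in", big_fasta,
        "-dbtype", "nucl",
        "-out", db_name,
    ]
    # Search the single query against the database, tabular output.
    search = [
        "blastn",
        "-query", query_fasta,
        "-db", db_name,
        "-outfmt", "6",
        "-num_threads", str(threads),
        "-out", out_file,
    ]
    return make_db, search


if __name__ == "__main__":
    for cmd in blast_commands("huge.fa", "query.fa"):
        print(" ".join(cmd))
```

If you run these from Python with `subprocess.run(cmd, check=True)`, a non-zero exit status raises an exception, so a run that was killed or ran out of space would not pass silently.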


Thank you for your comment. Indeed, BLAST did not show any error; however, I wanted your opinions to be sure that the results I get are reliable.
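One simple sanity check, assuming the hits were written in the default 12-column tabular format (`-outfmt 6`), is to summarize the output file and compare the counts between runs. The sketch below is illustrative only, and the function name `summarize_hits` is my own.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_TAB_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def summarize_hits(handle):
    """Count alignments and distinct subject sequences in tabular output."""
    n_rows = 0
    subjects = set()
    best_bitscore = None
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (present with -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        record = dict(zip(BLAST_TAB_COLUMNS, row))
        n_rows += 1
        subjects.add(record["sseqid"])
        score = float(record["bitscore"])
        if best_bitscore is None or score > best_bitscore:
            best_bitscore = score
    return {
        "alignments": n_rows,
        "subjects": len(subjects),
        "best_bitscore": best_bitscore,
    }


if __name__ == "__main__":
    # "hits.tsv" is a placeholder for your own output file.
    with open("hits.tsv", newline="") as fh:
        print(summarize_hits(fh))
```

If a repeated run against a proper BLAST database gives the same number of alignments and the same subjects, that is a reasonable sign the original search was not cut short.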
