Standalone Blast 2 Short Sequences
11.1 years ago
Richard ▴ 580

Hi. I'm new to BLAST. I have downloaded BLAST+ and installed it on a Windows machine.

I want to use BLAST to align a set of short nucleotide sequences (<400 bases), A, against a set of reads of similar length, B. Sets A and B contain only tens of reads each.

I see that I can use formatdb on one set of reads (A) and then use blast+ to match my other set of reads (B).

This is acceptable, but I see that NCBI hosts a tool where you can blast two sequences. Is this possible with the standalone blast without building the database?

Not sure it matters, but I'm doing this on Windows, probably with a Python wrapper.

11.1 years ago
Hamish ★ 3.2k

The answers to the following questions cover performing an all-against-all with short sequences and the use of NCBI BLAST+ to perform multiple pairwise sequence alignment:

In short, you do not need to create a BLAST database (i.e. run 'formatdb' or 'makeblastdb'). Instead, you can run something like the following (see the "BLAST Command Line Applications User Manual"):

blastn -query querySeqSet.tfa -subject targetSeqSet.tfa

This performs all of the required pairwise alignments. You'll then want to parse the results; see the "Biopython Tutorial and Cookbook: Chapter 7 BLAST" for details of how to do this using Biopython. Since Biopython prefers to parse the BLAST XML output, and you are using short sequences, you'll want a slightly more complex command line than the simple example above, something more like:

blastn -query querySeqSet.tfa -task blastn-short -out results.xml -subject targetSeqSet.tfa -outfmt 5 -dust no
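Since the question mentions a Python wrapper, here is a minimal sketch of one way to launch that command from Python using only the standard library. The function names `build_blastn_command` and `run_blastn` are made up for illustration, and the sketch assumes the `blastn` executable is on your PATH. Passing the arguments as a list avoids shell quoting problems, which can be awkward on Windows.

```python
import subprocess
from pathlib import Path


def build_blastn_command(query, subject, out, short=True):
    """Return the blastn argument list for a database-free search.

    Hypothetical helper: it only assembles the flags shown in the
    command above, it does not validate the input files.
    """
    cmd = [
        "blastn",
        "-query", str(query),
        "-subject", str(subject),  # FASTA file used instead of a database
        "-outfmt", "5",            # 5 = BLAST XML output
        "-out", str(out),
    ]
    if short:
        # Settings from the answer above for short sequences
        cmd += ["-task", "blastn-short", "-dust", "no"]
    return cmd


def run_blastn(query, subject, out):
    """Run blastn and return the path of the XML result file.

    Raises subprocess.CalledProcessError if blastn exits with an error,
    and FileNotFoundError if the blastn executable cannot be found.
    """
    subprocess.run(build_blastn_command(query, subject, out), check=True)
    return Path(out)
```

Calling `run_blastn("querySeqSet.tfa", "targetSeqSet.tfa", "results.xml")` should then be equivalent to typing the command above at the prompt.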

Note that, depending on the nature of your sequences and the desired properties of the alignments, it may be more appropriate to use alternative tools for generating the alignments, as detailed in the answers to the questions linked above.
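For the parsing step, the Biopython tutorial mentioned above is the recommended route. If you only need a few fields and would rather not install anything extra, the XML can also be read with Python's standard library. The following is a rough sketch under that assumption, not a replacement for the Biopython parser: `iter_hsps` is a made-up name, it only reads the element names shown in the code, and you should check what your BLAST+ version writes into the query and hit definition fields when `-subject` is used.

```python
import xml.etree.ElementTree as ET


def iter_hsps(root):
    """Yield one dict per HSP found in a parsed BLAST XML (-outfmt 5) tree.

    `root` is the root element of the XML document. Only a handful of
    fields are extracted; anything else in the file is ignored.
    """
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        for hit in iteration.iter("Hit"):
            subject = hit.findtext("Hit_def")
            for hsp in hit.iter("Hsp"):
                yield {
                    "query": query,
                    "subject": subject,
                    "evalue": float(hsp.findtext("Hsp_evalue")),
                    "identity": int(hsp.findtext("Hsp_identity")),
                    "align_len": int(hsp.findtext("Hsp_align-len")),
                }
```

For a result file written by the command above, `iter_hsps(ET.parse("results.xml").getroot())` would give you the alignments one at a time, which you can then filter on e-value or percent identity (`identity / align_len`).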

11.1 years ago
Fwip ▴ 490

I'm not familiar with (Bio)python, so here's how I'd do it in BioPerl, using code taken from the wiki:

use strict;
use warnings;
use Bio::Tools::Run::StandAloneBlastPlus;

my $factory = Bio::Tools::Run::StandAloneBlastPlus->new(
    -db_data => 'target.fas',  # Specifies the file to use for database
    -create  => 1              # Creates a new temporary database
);
my $result = $factory->blastn(
    -query   => 'query.fas',              # Specifies the query file
    -outfile => 'query_vs_target.bls' );  # Specifies the output location
$factory->cleanup;                        # Deletes the temporary database files

If you wanted to analyse the file in that Perl script, $result is an object implementing Bio::Search::Result::ResultI, which has usage examples here.

P.S.: I don't believe StandAloneBlastPlus is part of the default BioPerl installation. You can install it separately using the CPAN shell without much difficulty.

