Question: How To Extract Max Score Blast Hit Among A Huge Data Set?
gravatar for HG
4.5 years ago by
HG1.1k wrote:

Hi everyone, I want to extract max score blast sequence among 50 genomic data set : my approach like this:

  1. I did all the denovo assembly of all 50 genome.

  2. Next i predicted all the ORF using Prodigal: (A Microbial Gene Prediction Software)

  3. Now i want to set up a local blast with a query gene sequence vs all 50 genomic ORF.
  4. Finally extract the best blast score sequence among all 50 genome. Can anyone suggest me how to proceed step 3 and step 4

Thank you advance.

blast • 1.4k views
ADD COMMENTlink modified 2.2 years ago by Biostar ♦♦ 20 • written 4.5 years ago by HG1.1k
gravatar for Daniel
4.5 years ago by
Cardiff University
Daniel3.6k wrote:

You can make a blast database of your sequences with the makeblastdb command (in linux):

makeblastdb -in genomesORFs.fas -dbtype nucl -out my_db

Then blast your sequences against it with a command like this, but choose your own parameters (there's loads on info at sites like this and this:

blastall -p blastp -i my_input_file -d my_db -o my_output_file -m 8

You can run it through a perl script to print the best match for whatever parameter you want. Something like this pulled from a different script (NOT THIS):

use strict;
use warnings;

my $BLASTn_file = $ARGV[0] ;

open BLAST, "< $BLASTn_file or die("$BLASTn_file\n$!");
while (my $line = <BLAST>){
    if ( $line !~ /^#/ ) {
            chomp $line;

            my @fields = split(/\t/, $line);
            my $query = $fields[0];
            my $match = $fields[1];
            my $evalue = $fields[10];

            if ((not exists  $Match{$query}) or ($evalue <  $Evalue{$query})){
            # Select lowest eval
                    $Match{$query} = $match;
                    $Evalue{$query} = $evalue;

   while (my ($key, $value) = each (%Match)) {
        print "$key\t$value";
ADD COMMENTlink written 4.5 years ago by Daniel3.6k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 932 users visited in the last hour