Question: Obtaining the top matches from blast
S (United States) wrote 5.4 years ago:

Hi,

I have downloaded the current version of stand-alone BLAST (ncbi-blast-2.2.29+) and I am trying to use blastn to find the similarity of a group of nucleotide sequences that I have. However, I am interested in only the top 3 matches. I searched online and saw some posts that suggest using -K, but I realized this does not work with the new version that I am using. I looked at the help document and tried -max_target_seqs and -num_alignments, but neither of them worked: the result still contains all the matches found by BLAST.

Does anyone know how to limit the results to, let's say, just the top 3 matches?

Thanks!

Could you please explain a bit more about the sorting technique that has been referred to in this thread?

written 6 months ago by sinhaurjoshi0
hpmcwill (United Kingdom) wrote 5.4 years ago:

Depends what you are trying to do.

As Neilfws says, if you want to limit the number of hits reported you can use (from the NCBI BLAST+ help output):

 -num_descriptions <Integer, >=0>
   Number of database sequences to show one-line descriptions for
   Not applicable for outfmt > 4
   Default = `500'
    * Incompatible with:  max_target_seqs
 -num_alignments <Integer, >=0>
   Number of database sequences to show alignments for
   Default = `250'
    * Incompatible with:  max_target_seqs

These correspond to the '-v' and '-b' options in legacy NCBI BLAST:

  -v  Number of database sequences to show one-line descriptions for (V) [Integer]
    default = 500
  -b  Number of database sequence to show alignments for (B) [Integer]
    default = 250

The '-K' option in legacy NCBI BLAST:

  -K  Number of best hits from a region to keep. Off by default.
    If used a value of 100 is recommended.  Very high values of -v or -b is also suggested [Integer]

is slightly different and maps to the '-culling_limit' parameter in NCBI BLAST+:

 -culling_limit <Integer, >=0>
   If the query range of a hit is enveloped by that of at least this many
   higher-scoring hits, delete the hit
    * Incompatible with:  best_hit_overhang, best_hit_score_edge

You may also want to limit the number of matches reported per hit (i.e. limit the number of HSPs):

 -max_hsps <Integer, >=0>
   Set maximum number of HSPs per subject sequence to save (0 means no limit)
   Default = `0'
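
As a rough sketch of how these options might be combined for the "top 3" case, the wrapper below requests tabular output, at most N subject sequences per query, and one HSP per subject. The function name `blast_top_hits` and its three arguments are hypothetical, not part of BLAST+. It uses -max_target_seqs rather than -num_descriptions/-num_alignments because, per the help text above, -num_descriptions does not apply for outfmt > 4 and both options are incompatible with -max_target_seqs.

```shell
# Hypothetical wrapper around blastn (the name and arguments are illustrative only).
#   $1 = query FASTA file, $2 = BLAST database name, $3 = number of subjects to keep
# Example call, with placeholder file and database names:
#   blast_top_hits queries.fasta my_nt_db 3 > top3_hits.tsv
blast_top_hits() {
    query="$1"
    db="$2"
    n="$3"
    # -outfmt 6          : tabular output, one row per HSP
    # -max_target_seqs N : keep at most N subject sequences per query
    # -max_hsps 1        : keep at most one HSP per subject sequence
    blastn -query "$query" -db "$db" \
        -outfmt 6 \
        -max_target_seqs "$n" \
        -max_hsps 1
}
```

With the default pairwise report (outfmt 0), the equivalent attempt would presumably be -num_descriptions 3 -num_alignments 3 instead of -max_target_seqs.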

For more information about the NCBI BLAST+ command-line options see:

Thank you very much, hpmcwill! I am sorry that my post was not clear enough. I was looking to limit the number of matches reported per hit, so -max_hsps did the job.

written 5.4 years ago by S
shinken123 (México) wrote 2.3 years ago:

With BLAST output produced using the option -outfmt 6, what about keeping only the first row reported for each query:

         awk '!seen[$1]++' Blast_output_file.txt > Besthit_Blast_output_file.txt
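
The same idea can be stretched from the single best hit to the top N. This is a minimal sketch, assuming the file is tab-separated -outfmt 6 output in which BLAST has already listed each query's hits best first and kept all HSPs of one subject together. The function name `top_n_subjects` is hypothetical.

```shell
# Keep every row belonging to the first N distinct subjects (column 2)
# seen for each query (column 1) in BLAST tabular output (-outfmt 6).
# Assumes rows are already ordered best hit first within each query.
#   $1 = number of subjects to keep per query, $2 = BLAST tabular file
top_n_subjects() {
    n="$1"
    file="$2"
    awk -F '\t' -v n="$n" '
        {
            key = $1 SUBSEP $2
            if (!(key in seen)) {      # first row for this query/subject pair
                seen[key] = 1
                subjects[$1]++         # count distinct subjects per query
            }
            if (subjects[$1] <= n) print
        }
    ' "$file"
}
```

For example, `top_n_subjects 3 Blast_output_file.txt > Top3_Blast_output_file.txt` would keep the first three subjects per query (the output file name is a placeholder). Counting distinct subjects rather than rows matters when a subject contributes several HSPs, since each HSP is a separate row.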
Neilfws (Sydney, Australia) wrote 5.4 years ago:

The relevant options are in the BLAST handbook:

num_descriptions  integer  500  Show one-line descriptions for this number of database sequences.
num_alignments    integer  250  Show alignments for this number of database sequences.

Thank you very much for the link!

written 5.4 years ago by S
edrezen (France) wrote 5.4 years ago:

Hi,

Which output format are you using? I think these options may not work with the default BLAST output format.

If you try the tabular output format (just add -outfmt 6 to your command), it may work better.


Changing the format to option 6 didn't help.

written 5.4 years ago by S
Whoknows (Tehran, Iran) wrote 5.4 years ago:

Hi

Please run your query with the parameter -outfmt 6. With this output you can select the hits with the highest similarity, and you can also see the number of mismatches. Then sort the output.

Alternatively, use -best_hit_overhang to have BLAST itself find the best hit.
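
For the sorting step, one possible approach is sketched below. It assumes the standard 12-column -outfmt 6 layout, in which column 1 is the query ID and column 12 the bit score. The function name `best_hit_by_bitscore` is hypothetical, and `-g` (general numeric sort, which also copes with values in scientific notation) is an extension offered by GNU and BSD sort rather than a POSIX option.

```shell
# Sort BLAST tabular output by query ID, then by bit score (column 12) from
# highest to lowest, and keep the first row per query, i.e. its top-scoring hit.
#   $1 = BLAST tabular file (-outfmt 6, default 12 columns)
best_hit_by_bitscore() {
    file="$1"
    tab=$(printf '\t')
    # LC_ALL=C keeps number parsing and ordering independent of the user's locale.
    LC_ALL=C sort -t "$tab" -k1,1 -k12,12gr "$file" | awk -F '\t' '!seen[$1]++'
}
```

Sorting on column 11 (e-value) in ascending order, or on column 3 (percent identity) in descending order, follows the same pattern with a different `-k` key.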


Hi,

Thanks for the response. I knew I could sort and pick the top hit but I just thought there should be a parameter while running blast that can limit the results (at least there was one for an older version).

Thanks!

written 5.4 years ago by S