Question: Local Blast Returns No Result, While Using The Website There Are Some Matches
Asked 8.8 years ago by Zhizhong (Zhengzhou):

I want to BLAST some ESTs with local BLAST. However, for some sequences the local search returns no matches, while the website http://www.arabidopsis.org/Blast/index.jsp does find matches. (I downloaded the database from the same site, so the database I use locally is the same as the one on the website.) I don't know what the problem is or how to fix it.

I used BLAST 2.2.25+ and built the database with this command:

makeblastdb -in TAIR10_cdna.fast -out TAIR10_cdna -dbtype nucl -input_type fasta

Next I ran the search:

blastn -query buff.fa -db TAIR10_cdna -out cx274252 -dust yes -max_target_seqs 250 -penalty -3 -outfmt 4 -gapopen 5 -gapextend 2

The output looks like this:

    BLASTN 2.2.25+


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb
Miller (2000), "A greedy algorithm for aligning DNA sequences", J
Comput Biol 2000; 7(1-2):203-14.

Database: TAIR10_cdna.fast
           41,671 sequences; 64,867,051 total letters

Query= CX274252

Length=662

***** No hits found *****

Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 

Effective search space used: 41291330612

  Database: TAIR10_cdna.fast
    Posted date:  May 9, 2011  11:18 PM
  Number of letters in database: 64,867,051
  Number of sequences in database:  41,671

Matrix: blastn matrix 1 -3
Gap Penalties: Existence: 5, Extension: 2

while the results from the website were:

BLASTN 2.2.17 [Aug-26-2007]

Reference:
 Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= CX274252
         (662 letters)

Database: TAIR10 Transcripts (-introns, +UTRs) (DNA) 
           41,671 sequences; 64,867,051 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT4G27160.1  | Symbols: AT2S3, SESA3 | seed storage albumin ...    64   2e-09
AT4G27140.1  | Symbols: SESA1, AT2S1 | seed storage albumin ...    48   1e-04
AT4G27150.1  | Symbols: SESA2, AT2S2 | seed storage albumin ...    44   0.002
AT1G14170.3  | Symbols:  | RNA-binding KH domain-containing ...    44   0.002
AT1G14170.2  | Symbols:  | RNA-binding KH domain-containing ...    44   0.002
AT1G14170.1  | Symbols:  | RNA-binding KH domain-containing ...    44   0.002
AT4G27170.1  | Symbols: SESA4, AT2S4 | seed storage albumin ...    42   0.009
AT4G00895.1  | Symbols:  | ATPase, F1 complex, OSCP/delta su...    36   0.53 
... (this is a very long list, so I omitted the rest)

Database:  TAIR10 Transcripts (-introns, +UTRs) (DNA)
    Posted date:  Jan 13, 2011  1:41 PM
  Number of letters in database: 64,867,051
  Number of sequences in database:  41,671

Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 

Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Sequences: 41671
Number of Hits to DB: 342,107
Number of extensions: 18701
Number of successful extensions: 1355
Number of sequences better than 10.0: 41
Number of HSP's gapped: 1354
Number of HSP's successfully gapped: 53
Length of query: 662
Length of database: 64,867,051
Length adjustment: 18
Effective length of query: 644
Effective length of database: 64,116,973
Effective search space: 41291330612
Effective search space used: 41291330612
X1: 11 (21.8 bits)
X2: 15 (29.7 bits)
X3: 25 (49.6 bits)
S1: 13 (26.3 bits)
S2: 16 (32.2 bits)

I set the parameters to match the website settings as closely as I could, except for the weight matrix and max_score, which I don't know how to set.
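One way to check which of the reported settings really differ is to pull the scoring lines out of both reports and compare them. This is only a sketch: the file names `local_report.txt` and `web_report.txt` are hypothetical (save each report to a text file first), and it only looks at the lines that both reports above print in their footers.

```shell
# Hypothetical file names: local_report.txt holds the local blastn output,
# web_report.txt holds the report copied from the website.

# Print the scoring-related footer lines of a report in a normalised form.
# The sed step smooths over the "matrix:1 -3" vs "matrix 1 -3" spelling.
report_params() {
    grep -E '^(Matrix|Gap Penalties|Effective search space used)' "$1" |
        sed -e 's/matrix:/matrix /' | sort -u
}

if [ -f local_report.txt ] && [ -f web_report.txt ]; then
    report_params local_report.txt > local_params.txt
    report_params web_report.txt > web_params.txt
    # diff prints nothing and succeeds when the two files are identical.
    diff local_params.txt web_params.txt && echo "reported scoring parameters match"
fi
```

For the two reports shown above, the matrix, gap penalties and effective search space are identical, so the difference has to come from a setting that is not printed in these footer lines.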


Hi Zhizhong and welcome to Biostars. You should probably give more details about what you tried. What kind of BLAST searches did you try? What are the website and local database numbers? What options did you select on the site, and what command did you use on your computer? What version of BLAST did you use on your machine? Etc. You may even post one result for the same sequence in both cases (server vs. local). Cheers

written 8.8 years ago by Eric Normandeau

Thanks for the reminder. I edited the question and posted the full output for the same sequence from both searches.

written 8.8 years ago by Zhizhong

The most likely answer, assuming you set up local BLAST correctly, is that local and web BLAST used slightly different parameters. As Eric says, we need more details to answer the question.

written 8.8 years ago by Neilfws

Hi Zhizhong. If you have found your solution, you can take the time to write a clear answer to your own question and mark it as solved. There is nothing against that. Just make sure that both the question and the answer are well formatted. That way, the thread is more likely to be useful to others. Cheers!

written 8.8 years ago by Eric Normandeau

Are you using the same parameters (e-value 0.01, filter, composition statistics) as the web version? One or more of these parameters can affect your results.

written 8.8 years ago by Khader Shameer

I have now fixed it by setting -task blastn and -reward 1. I would like to delete this question if it is of no use to others.

written 8.8 years ago by Zhizhong

Answer, written 8.8 years ago by Zhizhong (Zhengzhou):

The problem was caused by using different parameter settings on the website and on the command line.

If I use the following command:

blastn -query buff.fa -db TAIR10_cdna -out cx274252 -task blastn \
    -reward 1 -dust yes -penalty -3 -gapopen 5 -gapextend 2

then the results are the same as on the website.

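The difference between this command and the one in the question lies mainly in the -task and -reward settings. -task selects the search preset: without it, blastn in BLAST+ defaults to megablast, which is tuned for highly similar sequences, whereas -task blastn runs the traditional blastn search. -reward sets the score for a nucleotide match.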
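To see the effect of the task setting on your own data, something like the following can be used. It is a minimal sketch, assuming blastn is on the PATH and using the file and database names from the question; `-outfmt 6` (tabular output, one line per alignment) is my choice here only because it makes the alignments easy to count.

```shell
# Tasks to compare: megablast is what blastn runs when -task is not given.
TASKS="megablast blastn"

if command -v blastn >/dev/null 2>&1 && [ -f buff.fa ]; then
    for task in $TASKS; do
        # -outfmt 6 prints one tab-separated line per alignment, so wc -l counts them.
        hits=$(blastn -task "$task" -query buff.fa -db TAIR10_cdna \
                      -reward 1 -penalty -3 -gapopen 5 -gapextend 2 \
                      -dust yes -outfmt 6 | wc -l)
        echo "$task: $hits alignment lines"
    done
else
    echo "blastn or buff.fa not found; nothing was run"
fi
```

If the megablast run reports zero lines while the blastn run reports some, the missing hits come from the task preset and not from the scoring parameters.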

modified 8.8 years ago by Eric Normandeau • written 8.8 years ago by Zhizhong

Hi Zhizhong. Thank you for the solution! Please consider using the editing options to make your next questions and answers more readable. Cheers!

written 8.8 years ago by Eric Normandeau