Question: Blastp against CAZy database
huiyus97 wrote (3 months ago):

Hi,

I want to BLAST several proteomes against the CAZy database from the command line. I downloaded the CAZy database from dbCAN2:

http://bcb.unl.edu/dbCAN2/download/

Here is my blastp command, run after building my own database:

blastp -query input.fasta -db cazydatabase.fa -evalue 1e-5 -outfmt 6 -out output -max_target_seqs 1

I got over 900 sequences in my results with an E-value cutoff of 1e-5. However, other people's work suggests that 300-400 is the expected number of significant sequences. Could someone please help me solve this problem?

Thanks in advance!


I got over 900 sequences in my results with an E-value cutoff of 1e-5. However, other people's work suggests that 300-400 is the expected number of significant sequences. Could someone please help me solve this problem?

Your data do not need to give results identical to those of others. Results are a characteristic of the data going into the analysis. If your data were identical to what others used (which I assume is not the case), then this would be a problem.

written 3 months ago by genomax

Hi, thank you for your response! I tested with the same proteome that other people used in their paper (I retrieved it from NCBI), and the results differ a lot.

written 3 months ago by huiyus97
Mensur Dlakic (USA) wrote (3 months ago):

Most likely the reason for this problem is the same as in your other post: you are using -outfmt 6 instead of the default pairwise output. Since BLAST is a local aligner, it will often find multiple high-scoring segment pairs (HSPs) between two proteins rather than a single global alignment. If there are 3 HSPs between a query and its match, that counts as a single hit and is shown as a single line in the summary of the pairwise output (though it appears as 3 alignments in the alignment part of that output). Since -outfmt 6 doesn't show alignments and instead prints one line per HSP, that single hit will be shown as 3 lines. Even though you are asking only for the top hit with -max_target_seqs 1, the output will often contain multiple lines per query because of HSPs. As I suggested to you before, try removing -outfmt 6 from your command line just to see what that output looks like.
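
To check whether extra HSP lines explain the inflated count, it may help to count distinct query IDs rather than lines in the tabular file. The following is only a rough sketch: the function names are made up for illustration, and it assumes a tab-separated -outfmt 6 file with the query ID in column 1 and the subject ID in column 2 (the default column order). The last function also assumes that BLAST lists the best-scoring HSP first for each query, which is the usual ordering but is worth verifying on your own output.

```shell
# Count distinct query IDs (column 1) in a BLAST -outfmt 6 file.
count_queries() {
    cut -f1 "$1" | sort -u | awk 'END { print NR }'
}

# Count distinct query/subject pairs (columns 1 and 2),
# i.e. hits rather than individual HSP lines.
count_hits() {
    cut -f1,2 "$1" | sort -u | awk 'END { print NR }'
}

# Keep only the first line reported for each query.
# Assumption: the first line for a query is its best HSP.
best_line_per_query() {
    awk -F'\t' '!seen[$1]++' "$1"
}
```

For example, `count_queries output` would give the number of query proteins with at least one hit, which is probably the number to compare with the 300-400 reported by others, and `best_line_per_query output > output.top` would write one line per query. BLAST+ also has a -max_hsps option; adding `-max_hsps 1` to the blastp command should limit the output to one HSP per query-subject pair, though it is worth confirming that your installed version supports it.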
