Question: When will the BLAST program stop giving out hits if there are too many (for web server and standalone)?
19 months ago, johnnytam100 wrote:

I have a protein domain of interest.

I want to search for standalone proteins in which that domain makes up most of the sequence length.

I can think of two methods for doing the job:

Method 1: blastp against all nr sequences -> filter the hits by desired length -> my result

Method 2: filter nr sequences by desired length -> build BLAST database -> blastp -> my result

I prefer Method 2 because I think that with Method 1 the overwhelming number of hits outside the desired length range will crowd out all the hits I want.
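
The length-filtering step of Method 2 could be sketched roughly as below. This is a minimal illustration, not a tested pipeline for nr: the function names `read_fasta` and `filter_by_length`, the toy sequences, and the length bounds are all assumptions made up for the example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len, max_len):
    """Keep only records whose sequence length lies in [min_len, max_len]."""
    for header, seq in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq


# Toy input; the bounds 10..20 are hypothetical and depend on the domain length.
example = [">short", "MKV", ">ok", "MKVLAAGIV", "GHT", ">long", "M" * 50]
kept = list(filter_by_length(read_fasta(example), min_len=10, max_len=20))
print(kept)
```

The surviving records would then be written back out as FASTA and passed to `makeblastdb` to build the reduced database.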

Of course I could simply test the two methods, but I would like to know:

1) How do the two methods compare?

2) When will the BLAST program stop giving out hits if there are too many (for web server and standalone)?

Thank you.

blast • 364 views
modified 18 months ago by Biostar • written 19 months ago by johnnytam100

I would go for option 1. It is much less biased and, I think, quicker than the subsampling approach.

To get all the hits you want, be sure to set num_alignments or max_target_seqs high enough. Depending on the input, you might also consider raising the e-value threshold.

As for the second part of your question: BLAST stops reporting hits once either of the limits mentioned above is reached.
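
For standalone BLAST+, a search with raised limits might be assembled along these lines. This is only a sketch: the helper `build_blastp_command`, the file names, and the values 5000 and 10.0 are placeholders chosen for illustration, not recommendations. The flags themselves (`-query`, `-db`, `-out`, `-outfmt`, `-max_target_seqs`, `-evalue`) are standard blastp options.

```python
import shlex


def build_blastp_command(query, db, out, max_target_seqs=5000, evalue=10.0):
    """Return a blastp argument list with explicit hit and e-value limits."""
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-out", out,
        # Tabular output with subject length appended as an extra column.
        "-outfmt", "6 std slen",
        # Upper bound on the number of subject sequences reported.
        "-max_target_seqs", str(max_target_seqs),
        # Hits with an e-value above this threshold are not reported.
        "-evalue", str(evalue),
    ]


# Hypothetical file and database names.
cmd = build_blastp_command("domain.fasta", "nr", "hits.tsv")
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ and the database are installed.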

written 19 months ago by lieven.sterck

Good point about bias! I want to know whether BLAST will in any case output all results that satisfy the constraints I set. How would num_alignments and max_target_seqs affect whether I get all the hits I want?

written 19 months ago by johnnytam100

Theoretically you can set them as high as the number of entries in your database, but I think you should not normally need to go to that extreme.

Running standalone BLAST with such a small input should not take too long, so you have the opportunity to try a few values for those parameters.
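
When trying different limits, the length filter of Method 1 can be applied to the tabular output afterwards. The sketch below assumes the search was run with `-outfmt "6 std slen"`, so that subject length is the 13th column. The function name `hits_in_length_range`, the sample rows, and the length bounds are invented for illustration.

```python
import csv
import io


def hits_in_length_range(tsv_text, min_len, max_len):
    """Return (subject_id, evalue, subject_length) for hits within the bounds.

    Expects BLAST tabular output produced with -outfmt "6 std slen":
    the 12 standard columns followed by the subject length.
    """
    kept = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue
        slen = int(row[12])  # 13th column: subject sequence length
        if min_len <= slen <= max_len:
            kept.append((row[1], float(row[10]), slen))
    return kept


# Two made-up hits: one short subject, one long multi-domain subject.
rows = [
    ["query1", "subjA", "85.0", "100", "15", "0", "1", "100", "5", "104", "1e-40", "150", "120"],
    ["query1", "subjB", "80.0", "100", "20", "0", "1", "100", "300", "399", "1e-35", "140", "950"],
]
sample = "\n".join("\t".join(r) for r in rows) + "\n"
selected = hits_in_length_range(sample, min_len=100, max_len=150)
print(selected)
```

If the number of selected hits stops growing as the limits are raised, that suggests the limits are no longer cutting off hits of interest.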

written 19 months ago by lieven.sterck