What Exactly Does Formatdb Do?
11.0 years ago
Griffan ▴ 90

As far as I know, BLAST searches for homology by indexing the query (building a word list), scanning the database (with an automaton), extending the hits, and so on. Why, then, do we need to pre-process the database with formatdb?

I know there is another approach: first index the k-mers (words) of the database, then look up the occurrences of each query word in that index, as BLAT and LASTZ do. I have read in articles that this is why BLAT is faster than BLAST: given a database of size G, locating each query takes O(1) time for the former and O(G) for the latter.
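To make the two strategies concrete, here is a toy Python sketch. It is only an illustration under simplifying assumptions (exact k-mer matches, overlapping words, no extension step, no automaton), and the function names are made up for this example; it is not how BLAST or BLAT are actually implemented.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield (offset, word) for every overlapping k-length word in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def seeds_by_scanning(query, database, k):
    """Index the query words, then scan the whole database once."""
    query_index = defaultdict(list)
    for q_pos, word in kmers(query, k):
        query_index[word].append(q_pos)

    hits = []
    # Every database position is visited, so the work grows with len(database).
    for d_pos, word in kmers(database, k):
        for q_pos in query_index.get(word, ()):
            hits.append((q_pos, d_pos))
    return sorted(hits)


def build_database_index(database, k):
    """Index the database words once, up front; reusable for many queries."""
    index = defaultdict(list)
    for d_pos, word in kmers(database, k):
        index[word].append(d_pos)
    return index


def seeds_by_lookup(query, db_index, k):
    """Look up each query word in the prebuilt database index."""
    hits = []
    # Only the query is walked; each word costs one dictionary lookup
    # plus the number of occurrences it returns.
    for q_pos, word in kmers(query, k):
        for d_pos in db_index.get(word, ()):
            hits.append((q_pos, d_pos))
    return sorted(hits)


if __name__ == "__main__":
    database = "ACGTACGTTTGACA"
    query = "CGTT"
    db_index = build_database_index(database, 3)
    print(seeds_by_scanning(query, database, 3))
    print(seeds_by_lookup(query, db_index, 3))
```

Both functions return the same (query offset, database offset) seed pairs; the difference is only where the per-query cost goes. The second one pays for the index once and keeps it in memory, which is the trade-off I am asking about.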

If you can point me to a link or give a brief summary, that would be helpful. Thanks.

blast • 2.5k views

You might ask "what does makeblastdb do?" since formatdb has been superseded by it.
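For reference, a minimal Python sketch of the two invocations side by side. The file and database names are hypothetical, and the helper functions are invented for this example; only the basic options (`-i`, `-p`, `-n` for formatdb and `-in`, `-dbtype`, `-out` for makeblastdb) are used.

```python
import shutil
import subprocess


def formatdb_command(fasta_path, is_protein, out_name):
    """Argument list for legacy formatdb (-p T for protein, -p F for nucleotide)."""
    return ["formatdb", "-i", fasta_path,
            "-p", "T" if is_protein else "F",
            "-n", out_name]


def makeblastdb_command(fasta_path, dbtype, out_name):
    """Argument list for BLAST+ makeblastdb; dbtype is 'nucl' or 'prot'."""
    return ["makeblastdb", "-in", fasta_path,
            "-dbtype", dbtype,
            "-out", out_name]


def run_if_available(cmd):
    """Run cmd only if its executable is on PATH; otherwise return None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # "db.fasta" and "mydb" are placeholder names for illustration.
    print(" ".join(formatdb_command("db.fasta", False, "mydb")))
    print(" ".join(makeblastdb_command("db.fasta", "nucl", "mydb")))
```

Either command should leave a set of binary database files next to the given output name, which is what the BLAST programs then read instead of the FASTA file.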

11.0 years ago
Hamish ★ 3.2k

The place to start would be the NCBI C++ Toolkit:

The toolkit includes developer documentation and source code for various NCBI tools, including NCBI BLAST+, with details of the internal data structures used in BLAST and of the BLAST database formats.

For the legacy NCBI BLAST, the equivalent information can be found in the NCBI Software Development Toolkit:

Information about the NCBI BLAST database format can also be found in various other places, for example:

And in third-party programs which read BLAST databases, for example:

Note that other implementations of BLAST use alternative formats for their databases, for example WU-BLAST/AB-BLAST uses the XDF format.
