Using BLAST to search MongoDB
5.7 years ago
munroes • 0

We have a MongoDB database filled with amino acid sequences and the corresponding DNA sequences. I would like to incorporate BLAST to locate exact and/or similar sequences in our database. The NCBI tool appears to work with local files out of the box, but I have yet to find an example of using it with MongoDB. Has anyone used this or a similar plugin with MongoDB?

sequencing • 1.3k views

AFAIK, the ability to run BLAST searches within the database itself was limited to Oracle. I am not sure how many people used that feature, but it was (perhaps still is) there.

I don't think you will be able to run BLAST directly on data inside your MongoDB database.


Extract the sequences from MongoDB and write them to a FASTA file. Index the FASTA file, and write a script that reads the identifiers from the BLAST output.

FASTA format: a file in which each description line starts with ">" and the following line holds the sequence.

Index a database: use the makeblastdb command for this: https://www.ncbi.nlm.nih.gov/books/NBK279688/

You cannot BLAST directly against MongoDB; BLAST needs its own BLAST database.
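The extraction step could look roughly like the sketch below. It is only an illustration under stated assumptions: the field name `aa_seq`, the function names, and the connection arguments are hypothetical, and `pymongo` is assumed to be the driver in use. `to_fasta` turns documents into FASTA lines, and `makeblastdb_cmd` only builds the argument list for `makeblastdb` (with its `-in`, `-dbtype` and `-out` options) without running it.

```python
from typing import Iterable, Iterator, List, Mapping


def to_fasta(docs: Iterable[Mapping], id_field: str = "_id",
             seq_field: str = "aa_seq", width: int = 60) -> Iterator[str]:
    """Yield FASTA lines for every document that has a non-empty sequence.

    The field names are assumptions; adjust them to your own schema.
    """
    for doc in docs:
        seq = (doc.get(seq_field) or "").strip()
        if not seq:
            continue  # skip documents without a sequence
        yield f">{doc[id_field]}"  # description line carries the MongoDB id
        for i in range(0, len(seq), width):
            yield seq[i:i + width]  # wrap the sequence at `width` characters


def makeblastdb_cmd(fasta_path: str, db_name: str,
                    dbtype: str = "prot") -> List[str]:
    """Build (but do not run) the makeblastdb command line."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", dbtype, "-out", db_name]


def export_collection(uri: str, db: str, coll: str, fasta_path: str) -> None:
    """Write one collection to a FASTA file (hypothetical helper, needs pymongo)."""
    from pymongo import MongoClient  # imported lazily so the rest runs without it

    cursor = MongoClient(uri)[db][coll].find({}, {"aa_seq": 1})
    with open(fasta_path, "w") as fh:
        for line in to_fasta(cursor):
            fh.write(line + "\n")
```

After `export_collection(...)` has written the file, the list returned by `makeblastdb_cmd("seqs.fa", "fulldatabase")` can be passed to `subprocess.run(..., check=True)`. Because the description line holds the MongoDB id, the identifiers in the BLAST output can be used to look the records up again.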


Thanks for the suggestion. That is definitely a solution, but performance might suffer. We have thousands of large sequences in our database, so that is a lot to extract each time.


But does the MongoDB database change that often? Could you not run an extraction once a week, overnight?


New sequences are added almost every day.


Then maybe do it every night. Alternatively, build a BLAST database of all the sequences now, and whenever you want to run a search, build an extra BLAST database containing only the new sequences that are missing from it. Then, say once a week over the weekend, rebuild the full database with all the sequences and remove the small ones.

You can blast against multiple databases like this:

blastn -query input.fa -db "fulldatabase mondayseqs tuesdayseqs"

Or make an alias file.
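A rough sketch of how the incremental scheme might be scripted follows. It assumes each document carries a timestamp field, here called `created_at` (hypothetical; your schema may differ), so that only recent additions are extracted. The helpers only build a MongoDB filter and command-line argument lists: `blastn` with its `-query` and `-db` options, and `blastdb_aliastool` with `-dblist`, `-dbtype`, `-out` and `-title`. Nothing is executed here.

```python
from datetime import datetime, timezone
from typing import Dict, List, Sequence


def added_since_filter(since: datetime, field: str = "created_at") -> Dict:
    """MongoDB filter selecting documents added at or after `since`.

    The `created_at` field is an assumption about the schema.
    """
    return {field: {"$gte": since}}


def blastn_cmd(query_fasta: str, dbs: Sequence[str]) -> List[str]:
    """Search several BLAST databases at once.

    The names go to -db as one space-separated argument.
    """
    return ["blastn", "-query", query_fasta, "-db", " ".join(dbs)]


def alias_cmd(dbs: Sequence[str], alias_name: str,
              dbtype: str = "nucl") -> List[str]:
    """Build a blastdb_aliastool call that groups `dbs` under one alias name."""
    return ["blastdb_aliastool", "-dblist", " ".join(dbs),
            "-dbtype", dbtype, "-out", alias_name, "-title", alias_name]


if __name__ == "__main__":
    monday = datetime(2024, 1, 1, tzinfo=timezone.utc)  # example date only
    print(added_since_filter(monday))
    print(blastn_cmd("input.fa", ["fulldatabase", "mondayseqs", "tuesdayseqs"]))
    print(alias_cmd(["fulldatabase", "mondayseqs"], "allseqs"))
```

Each returned list can be handed to `subprocess.run(..., check=True)`. With an alias in place, the search only needs `-db allseqs`, and the weekly rebuild can replace the alias together with the small daily databases.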
