Download nucleotide sequences database
6.8 years ago
arfaj a ▴ 10

Hello,

I want to build a BLAST tool to compare DNA sequences against a DNA database, e.g. GenBank.

I also want to store the DNA sequence database, the comparison results, and other tables in an SQL database.

The nucleotide sequence databases provided by NCBI are not organized as tables; they are sets of binary files, so I cannot store them in a relational database.

Is there another place that provides the sequence databases as a set of tables?

I hope you get my point.

Best regards,

Alaa

sequencing alignment sequence blast • 2.5k views
ADD COMMENT

Hello,

To clarify further:

I want to modify the BLAST code. I want to compare a sequence's gi value with another parameter (in an IF statement). To write the IF statement, I need to retrieve the gi value from the database. How can I do this?

The BLAST databases hosted at NCBI are binary files. How can I retrieve the gi value from a binary file?

I think this is difficult. In my opinion, I need to find another place that provides the BLAST databases in a table format (relational database), so that I can easily browse the database structure and just write a query to retrieve the gi value.

I hope you get my point.

Thanks a lot,
Alaa

ADD REPLY

So download the database and turn it into table format. The BLAST manual shows how to extract information from any NCBI BLAST database with blastdbcmd. Alternatively, you can e.g. download the entire GenBank and parse whatever info you need.
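As a rough sketch of the "turn it into table format" idea: blastdbcmd can print one line per sequence via `-outfmt`, and those lines can be loaded into SQLite. The command shown in the comment, the `|` delimiter, the table name `sequences`, and the sample rows are assumptions for illustration only; check `blastdbcmd -help` for the format specifiers your version supports.

```python
import sqlite3

# Assumed command that produced the input text (not run here):
#   blastdbcmd -db nt -entry all -outfmt "%a|%g|%l|%T|%t" > nt_table.txt
# %a = accession, %g = gi, %l = length, %T = taxid, %t = title

def to_int(value):
    """Return an int, or None when the field is not numeric (e.g. 'N/A')."""
    try:
        return int(value)
    except ValueError:
        return None

def load_blastdbcmd_table(lines, conn):
    """Load '|'-delimited blastdbcmd lines into a SQLite table 'sequences'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "accession TEXT PRIMARY KEY, gi INTEGER, length INTEGER, "
        "taxid INTEGER, title TEXT)"
    )
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # The title comes last and may contain '|', so split at most 4 times.
        accession, gi, length, taxid, title = line.split("|", 4)
        rows.append((accession, to_int(gi), to_int(length), to_int(taxid), title))
    conn.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)

# Made-up sample rows, only to show the shape of the data.
sample = [
    "AB000001.1|1234567|1500|9606|Hypothetical example sequence one\n",
    "AB000002.1|N/A|820|10090|Hypothetical example | with a pipe in the title\n",
]
conn = sqlite3.connect(":memory:")
loaded = load_blastdbcmd_table(sample, conn)
```

Once the rows are in a table, a plain query such as `SELECT gi FROM sequences WHERE accession = ?` returns the value you want to test in your IF statement.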

ADD REPLY

Regarding your alternative solution, I have no idea about the GenBank structure. Is it a set of tables? Where can I download it from?

Thank you

ADD REPLY

Any ideas, please?

ADD REPLY
6.8 years ago
biocyberman ▴ 840

For a comment, you need to provide an example of which table columns you want to have. It's still unclear what you are aiming for.

And sorry, I may be talking against your will. Unless you really want to use the full sequence for downstream analysis, I think it is not a good idea to store full public DNA sequences in your SQL database. It is enough to have the sequence ID and version from the BLAST hit. Check the help information about the BLAST output format (i.e. read the output of blastn -help); BLAST does give output as a table.
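To illustrate what "output as a table" can look like in practice, here is a minimal sketch that parses BLAST+ tabular output (`-outfmt 6`, default 12 columns) and stores the hits in SQLite. The table name `hits` and the sample line are made up for illustration; a customised `-outfmt` column list would need a matching change to the parser.

```python
import sqlite3

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def parse_hits(text):
    """Parse tab-separated BLAST hits into a list of typed tuples."""
    hits = []
    for line in text.splitlines():
        # Skip blank lines and the '#' comment lines written by -outfmt 7.
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append((
            f[0], f[1], float(f[2]),
            int(f[3]), int(f[4]), int(f[5]),
            int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            float(f[10]), float(f[11]),
        ))
    return hits

def store_hits(hits, conn):
    """Create the 'hits' table if needed and insert all parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "qseqid TEXT, sseqid TEXT, pident REAL, length INTEGER, "
        "mismatch INTEGER, gapopen INTEGER, qstart INTEGER, qend INTEGER, "
        "sstart INTEGER, send INTEGER, evalue REAL, bitscore REAL)"
    )
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.executemany(f"INSERT INTO hits VALUES ({placeholders})", hits)
    conn.commit()

# Made-up example line in the default -outfmt 6 layout.
sample_output = "query1\tAB000001.1\t98.50\t200\t3\t0\t1\t200\t501\t700\t1e-95\t350\n"
hits = parse_hits(sample_output)
conn = sqlite3.connect(":memory:")
store_hits(hits, conn)
```

Only the subject ID (`sseqid`) is stored here, not the sequence itself, which matches the suggestion of keeping just the ID and version from each hit.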

ADD COMMENT

I want to use one of the parameters in the DNA database in my BLAST code, namely the sequence modification date.

So I need to retrieve it from the DNA database. If I download the DNA database to my local computer (without storing it in my SQL database), is it possible to check that variable in my BLAST code?

If yes, where can I download a DNA database that is organized as tables?

I hope you get my point.

Thanks

ADD REPLY

NCBI is the biggest sequence database, especially when you are using their BLAST databases. If you can't find the information there, no other place will have it. BLAST databases do not seem to provide the sequence date, because in many cases the sequence ID and version are enough. Maybe refine your strategy here. However, if you really want to proceed with the date, dig into their documentation on querying information from NCBI with tools like efetch: http://www.ncbi.nlm.nih.gov/books/NBK25499/
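A minimal sketch of the efetch route, assuming you fetch the GenBank flat-file record for an accession and read the date at the end of its LOCUS line (the date of last modification). The accession, the sample LOCUS line, and the cutoff date are made up for illustration; the network call is left as a comment so the parsing part can be tried offline. Month parsing with `%b` assumes an English/C locale.

```python
from datetime import date, datetime
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def efetch_genbank_url(accession):
    """Build an efetch URL that returns a nucleotide record as GenBank text."""
    params = {"db": "nuccore", "id": accession, "rettype": "gb", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)

def locus_date(genbank_text):
    """Return the date from the LOCUS line of a GenBank record, or None."""
    for line in genbank_text.splitlines():
        if line.startswith("LOCUS"):
            # The last whitespace-separated field looks like 21-JUN-1999.
            return datetime.strptime(line.split()[-1], "%d-%b-%Y").date()
    return None

# To fetch a real record (requires network access; not executed here):
#   from urllib.request import urlopen
#   text = urlopen(efetch_genbank_url("AB000001.1")).read().decode()

# Made-up LOCUS line standing in for a downloaded record.
sample_record = "LOCUS       AB000001     1500 bp    DNA     linear   PRI 21-JUN-1999\n"
url = efetch_genbank_url("AB000001.1")
mod_date = locus_date(sample_record)

# The kind of IF statement asked about, with a hypothetical cutoff.
cutoff = date(2000, 1, 1)
is_older_than_cutoff = mod_date is not None and mod_date < cutoff
```

For many hits it would be kinder to NCBI's servers to request IDs in batches and cache the dates in your own table rather than calling efetch once per hit.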

ADD REPLY

You are right about NCBI, but how do I retrieve the sequence ID and version from binary files? The BLAST databases at NCBI are distributed as binary files.

If the database were built from tables, I could browse the variables easily, but with binary files I cannot. The binary files cannot even be opened.

ADD REPLY
