Question: Download nucleotide sequences database
arfaj a10 wrote:

Hello,

I want to build a BLAST tool that compares DNA sequences against a DNA database such as GenBank.

I also want to store the DNA sequence database, the comparison results, and other tables in an SQL database.

The nucleotide sequence databases provided by NCBI are not organized as tables; they are a set of binary files, so I cannot store them in a relational database.

Is there another source that provides the sequence databases as a set of tables?

I hope you get my point.

Best regards,

Alaa

modified 4.4 years ago • written 4.4 years ago by arfaj a10

Hello,

To clarify further:

I want to modify the BLAST code. I want to compare a sequence's GI value with another parameter in an IF statement, and to write that IF statement I need to retrieve the GI value from the database. How can I do this?

The BLAST databases hosted at NCBI are binary files. How can I retrieve the GI value from a binary file?

I think this is difficult. In my opinion, I need to find another source that provides the BLAST databases in a table format (a relational database), so that I can easily browse the database structure and simply write a query to retrieve the GI value.

I hope you get my point.

Thanks a lot,
Alaa

modified 5 months ago by RamRS27k • written 4.4 years ago by arfaj a10

So download the database and turn it into table format. The BLAST manual shows how to extract information from any NCBI BLAST database with blastdbcmd. Alternatively, you can, for example, download all of GenBank and parse whatever information you need.
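As a rough illustration of the blastdbcmd route, here is a minimal Python sketch that dumps a local BLAST database to tab-separated lines and loads them into SQLite. The function names, the table name `sequences`, and the chosen fields are assumptions made for this example, not part of BLAST. It also assumes blastdbcmd is installed, passes literal tab characters in `-outfmt` through unchanged, and that the database may print a non-numeric placeholder where no GI is stored.

```python
import sqlite3
import subprocess

# blastdbcmd format specifiers: accession, GI, length, taxid, title.
# The title goes last because it is free text.
OUTFMT = "\t".join(["%a", "%g", "%l", "%T", "%t"])


def dump_blast_db(db_name):
    """Yield one line per sequence from a local BLAST database.

    Assumes blastdbcmd is on PATH and db_name is a database it can find.
    """
    cmd = ["blastdbcmd", "-db", db_name, "-entry", "all", "-outfmt", OUTFMT]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        yield line
    proc.wait()


def to_int(text):
    """Return an int for purely numeric text, otherwise None (stored as NULL)."""
    return int(text) if text.isdigit() else None


def parse_lines(lines):
    """Turn dumped lines into (accession, gi, length, taxid, title) tuples."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split at most 4 times so tabs inside the title are preserved.
        accession, gi, length, taxid, title = line.split("\t", 4)
        yield accession, to_int(gi), to_int(length), to_int(taxid), title


def load_into_sqlite(conn, rows):
    """Create the (hypothetical) sequences table and insert all rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences ("
        "accession TEXT PRIMARY KEY, gi INTEGER, length INTEGER, "
        "taxid INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()


# Example use (needs a local database, here assumed to be called "nt"):
#   conn = sqlite3.connect("sequences.db")
#   load_into_sqlite(conn, parse_lines(dump_blast_db("nt")))
```

Once loaded, the GI (where present) can be retrieved with an ordinary query such as `SELECT gi FROM sequences WHERE accession = ?`.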

modified 4.4 years ago • written 4.4 years ago by 5heikki8.7k

Regarding your alternative solution, I have no idea about the GenBank structure. Is it a set of tables? Where can I download it from?

Thank you

modified 5 months ago by RamRS27k • written 4.4 years ago by arfaj a10

ftp://ftp.ncbi.nlm.nih.gov/genbank/README.genbank
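GenBank flat files are plain text rather than tables: each record starts with a LOCUS line, whose last field is the date of last modification, and ends with a line containing `//`. A minimal sketch of pulling the accession.version and that date out of such a file could look like the following. The function names and the sample record are made up for illustration, and real LOCUS lines carry more detail than this simplified one.

```python
from datetime import date

# Month abbreviations as they appear in LOCUS dates (e.g. 21-JUN-1999).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}


def parse_locus_date(locus_line):
    """Return the date held in the last field of a LOCUS line."""
    day, month, year = locus_line.split()[-1].split("-")
    return date(int(year), MONTHS[month.upper()], int(day))


def iter_records(lines):
    """Yield (accession_version, modification_date) per flat-file record."""
    version = None
    mod_date = None
    for line in lines:
        if line.startswith("LOCUS"):
            mod_date = parse_locus_date(line)
        elif line.startswith("VERSION"):
            parts = line.split()
            version = parts[1] if len(parts) > 1 else None
        elif line.startswith("//"):
            # End of record: emit what was collected, then reset.
            if version is not None:
                yield version, mod_date
            version = None
            mod_date = None


# Made-up record, only to show the shape of the input.
SAMPLE = """\
LOCUS       XX000001     20 bp    DNA     linear   PLN 21-JUN-1999
DEFINITION  Hypothetical example record.
ACCESSION   XX000001
VERSION     XX000001.1
ORIGIN
        1 gatcctccat atacaacggt
//
"""

for accession_version, modified in iter_records(SAMPLE.splitlines()):
    print(accession_version, modified.isoformat())
```

The resulting pairs can be inserted into an SQL table keyed by accession.version, which gives a queryable modification date without storing the sequences themselves.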

written 4.4 years ago by 5heikki8.7k

Any ideas, please?

written 4.4 years ago by arfaj a10
biocyberman810 (Denmark) wrote:

As a comment: you need to provide an example of which columns you want the table to have. It is still unclear what you are aiming for.

And sorry if this goes against what you want, but unless you really need the full sequences for downstream analysis, I do not think it is a good idea to store full public DNA sequences in your SQL database. It is enough to keep the sequence ID and version from each BLAST hit. Check the help on BLAST output formats (i.e. read the output of blastn -help); BLAST can give its output as a table.
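To make the "output as a table" point concrete, here is a minimal sketch that reads the tab-separated output produced by `-outfmt 6` and stores the hits in SQLite. The twelve column names are the default fields for that format; the function names, the table name `blast_hits`, and the sample line are assumptions for illustration.

```python
import sqlite3

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_hits(lines):
    """Yield one typed tuple per hit line, skipping blanks and comments."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        yield (
            f[0], f[1], float(f[2]),
            int(f[3]), int(f[4]), int(f[5]),
            int(f[6]), int(f[7]), int(f[8]), int(f[9]),
            float(f[10]), float(f[11]),
        )


def store_hits(conn, hits):
    """Create the (hypothetical) blast_hits table and insert the hits."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blast_hits ("
        "qseqid TEXT, sseqid TEXT, pident REAL, length INTEGER, "
        "mismatch INTEGER, gapopen INTEGER, qstart INTEGER, qend INTEGER, "
        "sstart INTEGER, send INTEGER, evalue REAL, bitscore REAL)"
    )
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.executemany(f"INSERT INTO blast_hits VALUES ({placeholders})", hits)
    conn.commit()


# Made-up hit line, only to show the shape of the input.
SAMPLE = ["query1\tXX000001.1\t98.50\t200\t3\t0\t1\t200\t51\t250\t1e-95\t350\n"]

conn = sqlite3.connect(":memory:")
store_hits(conn, parse_hits(SAMPLE))
for row in conn.execute("SELECT sseqid, pident, evalue FROM blast_hits"):
    print(row)
```

The `sseqid` column already holds the subject's ID and version, so that value alone is enough to look up further details about a hit later.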

modified 5 months ago by RamRS27k • written 4.4 years ago by biocyberman810

I want to use one of the fields in the DNA database in my BLAST code, namely the sequence modification date.

So I need to retrieve it from the DNA database. If I download the DNA database to my local computer (and do not store it in my SQL database), is it possible to check that field in my BLAST code?

If yes, where can I download a DNA database that is organized as tables?

I hope you get my point.

Thanks

modified 5 months ago by RamRS27k • written 4.4 years ago by arfaj a10

NCBI is the biggest sequence database, especially when you are using their BLAST databases. If you can't find the information there, no other place will have it. BLAST databases do not seem to include the sequence date, because in many cases the sequence ID and version are enough. Maybe refine your strategy on this point. However, if you really want to proceed with the date, dig into their documentation on querying NCBI with tools like efetch: http://www.ncbi.nlm.nih.gov/books/NBK25499/
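For the efetch route, a minimal sketch might request a single record from the nuccore database as a GenBank flat file and read the date off its LOCUS line. The function names here are assumptions for illustration, the accession in the usage comment is a placeholder, and anything beyond occasional lookups should follow the usage guidelines in the E-utilities documentation linked above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession):
    """Build an efetch URL asking for one nucleotide record as flat-file text."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "gb",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_locus_date(accession):
    """Download one record and return the raw date string of its LOCUS line.

    Returns None if no LOCUS line is found. Requires network access.
    """
    with urlopen(efetch_url(accession)) as response:
        for raw in response:
            line = raw.decode("utf-8", errors="replace")
            if line.startswith("LOCUS"):
                # The last field of the LOCUS line is the modification date.
                return line.split()[-1]
    return None


# Example use (placeholder accession; needs network access):
#   print(fetch_locus_date("XX000001.1"))
```

The accession.version values for such lookups can be taken directly from the subject ID column of the BLAST hits, so the date check can happen after the search instead of inside the BLAST code.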

modified 5 months ago by RamRS27k • written 4.4 years ago by biocyberman810

You are right about NCBI, but how do I retrieve the sequence ID and version from binary files? The BLAST databases at NCBI are distributed as binary files.

If the database were organized as tables, I could browse the fields easily, but with binary files I cannot. The binary files cannot even be opened.

modified 5 months ago by RamRS27k • written 4.4 years ago by arfaj a10