Question: Download nucleotide sequences database
arfaj a wrote (3.4 years ago):

Hello,

I want to build a BLAST-based tool that compares DNA sequences against a DNA database such as GenBank.

I also want to store the DNA sequence database, the comparison results, and other tables in an SQL database.

The nucleotide sequence databases provided by NCBI are not organized as tables; they are sets of binary files, so I cannot store them in a relational database.

Is there another source that provides the sequence databases as a set of tables?

I hope my point is clear.

Best regards,

Alaa

modified 3.4 years ago • written 3.4 years ago by arfaj a
biocyberman (Denmark) wrote (3.4 years ago):

To get a useful comment, you need to give an example of the table columns you want. It is still unclear what you are aiming for.

And sorry, I may be talking against your will. Unless you really want to use the full sequences for downstream analysis, I think it is not a good idea to store full public DNA sequences in your SQL database. It is enough to store the sequence ID and version from each BLAST hit. Check the help on BLAST output formats (i.e. read the output of `blastn -help`); BLAST can give its output as a table.
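
As a rough illustration of the tabular route described above, here is a minimal sketch. It assumes the hits were written with `-outfmt 6` and its classic twelve default columns. The table name `blast_hits`, the function `load_hits`, and the sample line are made up for illustration and are not part of BLAST.

```python
import csv
import io
import sqlite3

# Classic default columns of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def load_hits(conn, tabular_text):
    """Insert tab-separated BLAST hits into a hypothetical `blast_hits` table.

    Returns the number of rows inserted. Lines that do not have exactly
    twelve fields are skipped.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blast_hits ("
        "qseqid TEXT, sseqid TEXT, pident REAL, length INTEGER, "
        "mismatch INTEGER, gapopen INTEGER, qstart INTEGER, qend INTEGER, "
        "sstart INTEGER, send INTEGER, evalue REAL, bitscore REAL)"
    )
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    rows = [row for row in reader if len(row) == len(BLAST_COLUMNS)]
    placeholders = ",".join("?" for _ in BLAST_COLUMNS)
    conn.executemany(f"INSERT INTO blast_hits VALUES ({placeholders})", rows)
    conn.commit()
    return len(rows)


# Made-up example line, not real BLAST output.
SAMPLE = "query1\tAB000001.1\t98.50\t200\t3\t0\t1\t200\t101\t300\t1e-95\t350\n"

conn = sqlite3.connect(":memory:")
n = load_hits(conn, SAMPLE)
```

With this layout the subject ID and version sit in the `sseqid` column, so a plain SQL query is enough to list them without storing any sequence data.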

written 3.4 years ago by biocyberman

I want to use one of the fields in the DNA database in my BLAST code: the sequence modification date.

So I need to retrieve it from the DNA database. If I download the DNA database to my local computer (without storing it in my SQL database), can I check that field in my BLAST code?

If yes, where can I download a DNA database that is organized as tables?

I hope my point is clear.

Thanks

written 3.4 years ago by arfaj a

NCBI is the biggest sequence database, especially when you are using their BLAST databases. If you can't find the information there, no other place can give it to you. BLAST databases do not seem to include the sequence date, because in many cases the sequence ID and version are enough. Maybe refine your strategy here. However, if you really want to proceed with the date, dig into their documentation on querying information from NCBI with tools like efetch: http://www.ncbi.nlm.nih.gov/books/NBK25499/
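
A minimal sketch of the efetch route, assuming the date you are after is the one printed at the end of the LOCUS line of a GenBank flat-file record. The helper names and the sample LOCUS line are made up for illustration. The URL parameters follow the E-utilities documentation linked above, and nothing is actually fetched here.

```python
import re
from datetime import date
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# GenBank flat files print month names as upper-case English abbreviations.
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def efetch_url(accession):
    """Build a URL asking efetch for one nucleotide record as GenBank text."""
    params = {"db": "nuccore", "id": accession, "rettype": "gb", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def locus_date(record_text):
    """Return the date at the end of the LOCUS line, or None if not found."""
    for line in record_text.splitlines():
        if line.startswith("LOCUS"):
            match = re.search(r"(\d{2})-([A-Z]{3})-(\d{4})\s*$", line)
            if match and match.group(2) in MONTHS:
                day, month, year = match.groups()
                return date(int(year), MONTHS[month], int(day))
    return None


# Made-up record header for illustration only.
SAMPLE = (
    "LOCUS       AB000001                 200 bp    DNA     linear   PRI 05-FEB-1999\n"
    "DEFINITION  Hypothetical example record.\n"
)
```

The text returned from such a URL could be passed to `locus_date`, and the resulting date stored next to the BLAST hit for that accession.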

written 3.4 years ago by biocyberman

You are right about NCBI, but how do I retrieve the sequence ID and version from binary files? The BLAST databases at NCBI are distributed as binary files.

If the database were built from tables, I could browse the fields easily, but with binary files I cannot. The binary files cannot even be opened.

written 3.4 years ago by arfaj a
arfaj a wrote (3.4 years ago):

Hello,

To clarify further:

I want to modify the BLAST code. I want to compare a sequence's GI value with another parameter in an IF statement. To write the IF statement, I need to retrieve the GI value from the database. How can I do this?

The BLAST databases hosted at NCBI are binary files. How can I retrieve the GI value from a binary file?

I think this is difficult. In my opinion, I need to find another source that provides the BLAST databases in a table format (a relational database), so that I can easily browse the database structure and simply write a query to retrieve the GI value.

I hope my point is clear.

Thanks a lot,

Alaa

written 3.4 years ago by arfaj a

So download the database and turn it into table format. The BLAST manual shows how to extract information from any NCBI BLAST database with blastdbcmd. Alternatively, you can, for example, download the entire GenBank and parse whatever information you need.
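
A minimal sketch of the blastdbcmd route, assuming the database was dumped beforehand with something like `blastdbcmd -db nt -entry all -outfmt "%a %g %l"` (accession, GI, sequence length). The database name `nt`, the `Entry` type, both helper functions, and the sample text are placeholders for illustration. Some databases report no GI, which is treated as missing here.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    accession: str
    gi: Optional[int]  # None when the dump shows no numeric GI
    length: int


def parse_entries(text: str) -> List[Entry]:
    """Turn whitespace-separated 'accession gi length' lines into rows."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[2].isdigit():
            continue  # skip blank or malformed lines
        accession, gi, length = fields
        entries.append(
            Entry(accession, int(gi) if gi.isdigit() else None, int(length))
        )
    return entries


def gi_above(entries: List[Entry], threshold: int) -> List[Entry]:
    """The kind of IF test asked about: keep rows whose GI exceeds a value."""
    return [e for e in entries if e.gi is not None and e.gi > threshold]


# Made-up dump for illustration, not real blastdbcmd output.
SAMPLE = "AB000001.1 111111 200\nAB000002.1 N/A 350\nAB000003.1 333333 120\n"

entries = parse_entries(SAMPLE)
selected = gi_above(entries, 200000)
```

Rows in this shape can be loaded into any SQL table, which gives the table-style view of the BLAST database asked for in this thread without changing BLAST itself.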

modified 3.4 years ago • written 3.4 years ago by 5heikki

Regarding your alternative solution, I have no idea about the structure of GenBank. Is it a set of tables? Where can I download it?

Thank you

written 3.4 years ago by arfaj a

ftp://ftp.ncbi.nlm.nih.gov/genbank/README.genbank

written 3.4 years ago by 5heikki
arfaj a wrote (3.4 years ago):

Any ideas, please?

written 3.4 years ago by arfaj a