Using Entrez Utilities to query the Nucleotide database by collection_date
1
1
Entering edit mode
7 weeks ago

Greetings,

I was wondering whether there is a way to query the NCBI Nucleotide database by collection_date using E-utilities. The image below shows data I retrieved from GenBank files; column E is the collection date.

Is the only way to query by collection_date to download all of the GenBank files, save their collection dates, and then sort them?

Thank you for the help!

enter image description here

ncbi entrez • 397 views
7 weeks ago
GenoMax 127k

Using EntrezDirect: the date in the last column of the output below is the collection date. As far as I know there is no way to search on it directly, though I could be wrong.

$ esearch -db nuccore -query "coronavirus" | esummary | xtract -pattern DocumentSummary -element Caption,SubName 

OQ355523    SARS-CoV-2/human/USA/NY_MEDDAC_FD_643/2023|Homo sapiens|SARS-CoV-2|USA|2023-01-09
OQ355522    SARS-CoV-2/human/USA/NY_MEDDAC_FD_642/2023|Homo sapiens|SARS-CoV-2|USA|2023-01-11
OQ355521    SARS-CoV-2/human/USA/NY_MEDDAC_FD_641/2023|Homo sapiens|SARS-CoV-2|USA|2023-01-11
OQ355520    SARS-CoV-2/human/USA/NY_MEDDAC_FD_640/2023|Homo sapiens|SARS-CoV-2|USA|2023-01-09
OQ355519    SARS-CoV-2/human/USA/NY_MEDDAC_FD_639/2023|Homo sapiens|SARS-CoV-2|USA|2023-01-05

Perhaps something like this to look for a specific date, filtering the output with grep after retrieval:

$ esearch -db nuccore -query "coronavirus" | esummary | xtract -pattern DocumentSummary -element Caption,SubName | grep 2023-01-11 
OQ355522    SARS-CoV-2/human/USA/NY_MEDDAC_FD_642/2023|Homo sapiens|SARS-CoV-2|USA|2023-01-11
OQ355521    SARS-CoV-2/human/USA/NY_MEDDAC_FD_641/2023|Homo sapiens|SARS-CoV-2|USA|2023-01-11
OQ353253    SARS-CoV-2/human/USA/NV-CDC-LC0994357/2023|Homo sapiens|SARS-CoV-2|USA: Nevada|Nasal Swabs|2023-01-11
OQ353252    SARS-CoV-2/human/USA/WA-CDC-LC0994972/2023|Homo sapiens|SARS-CoV-2|USA: Washington|Nasal Swabs|2023-01-11
OQ353251    SARS-CoV-2/human/USA/KY-CDC-LC0994843/2023|Homo sapiens|SARS-CoV-2|USA: Kentucky|Nasal Swabs|2023-01-11
OQ353250    SARS-CoV-2/human/USA/WV-CDC-LC0994837/2023|Homo sapiens|SARS-CoV-2|USA: West Virginia|Nasal Swabs|2023-01-11
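
Since grep matches the date anywhere on the line, a slightly stricter filter can compare only the last "|"-separated field. The following is a minimal sketch, assuming the collection date is always the last field (as it is in the output above, but this is not guaranteed for every record) and that dates are in ISO YYYY-MM-DD form. The helper names filter_by_collection_date and filter_by_date_range are hypothetical and not part of EntrezDirect.

```shell
# Hypothetical helpers (not part of EDirect). They read the tab/"|"-delimited
# lines produced by the xtract command above from standard input.

# Keep rows whose last "|"-separated field equals the given date.
filter_by_collection_date() {
    awk -F'|' -v d="$1" '($NF "") == (d "")'
}

# Keep rows whose last field falls within [from, to], inclusive.
# Appending "" forces string comparison, which orders ISO dates correctly.
filter_by_date_range() {
    awk -F'|' -v from="$1" -v to="$2" \
        '($NF "") >= (from "") && ($NF "") <= (to "")'
}

# Usage with the pipeline from the answer (requires EDirect and network access):
# esearch -db nuccore -query "coronavirus" | esummary |
#   xtract -pattern DocumentSummary -element Caption,SubName |
#   filter_by_collection_date 2023-01-11
```

Records with partial dates (year only, or year and month) would not match an exact-date filter, so they may need separate handling.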

That looks pretty useful. I'll give that a try as well. Thank you very kindly for the help.


The number of columns returned by the query in the "|" delimited portion of the data is inconsistent. Is it possible to normalize the number of columns so it's easier to query the results?

enter image description here
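
One possible way to get a fixed number of columns is to also extract the SubType element, which lists the qualifier names in the same "|"-separated order as the values in SubName, and then pick out named qualifiers. This is a sketch under assumptions: that SubType and SubName are parallel lists in the nuccore document summary, and that the qualifier names of interest are host, country, and collection_date. The helper name normalize_subname is hypothetical and not part of EntrezDirect.

```shell
# Hypothetical helper (not part of EDirect).
# Input:  tab-separated lines "accession<TAB>SubType<TAB>SubName".
# Output: accession followed by one column per requested qualifier;
#         qualifiers missing from a record are printed as "NA".
# Optional argument: comma-separated qualifier names to output.
normalize_subname() {
    awk -F'\t' -v OFS='\t' -v want="${1:-host,country,collection_date}" '
    BEGIN { n = split(want, cols, ",") }
    {
        split("", val)                  # clear the lookup table for each record
        nk = split($2, keys, "|")       # qualifier names
        split($3, vals, "|")            # qualifier values, same order
        for (i = 1; i <= nk; i++) val[keys[i]] = vals[i]
        line = $1
        for (i = 1; i <= n; i++)
            line = line OFS ((cols[i] in val) ? val[cols[i]] : "NA")
        print line
    }'
}

# Usage (requires EDirect and network access):
# esearch -db nuccore -query "coronavirus" | esummary |
#   xtract -pattern DocumentSummary -element Caption,SubType,SubName |
#   normalize_subname
```

Because every output row then has the same columns, the collection date can be filtered or sorted on by column position.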
