EDirect query help: how to determine which queries can be linked to a different database?
2.2 years ago
john ▴ 130

The task I want to accomplish: given a set of nuccore IDs, e.g. {NG_060216.1, NG_033828.1, HG994383.1}, which entries represent a full genome?

My approach so far is:

esearch -db nuccore -query "NG_060216.1,NG_033828.1,HG994383.1" | elink -target genome

which returns

<ENTREZ_DIRECT>
  <Db>genome</Db>
  <WebEnv>MCID_623d7cc3a6be7950e21eb738</WebEnv>
  <QueryKey>3</QueryKey>
  <Count>1</Count>
  <Step>2</Step>
</ENTREZ_DIRECT>

This shows that only one of the queries can be linked to the genome database. Now the question is: what command retrieves which of the original query accessions can and cannot be linked?

Thanks, John

edirect NCBI entrez • 804 views

You could get an idea of what each accession represents by running:

$ esearch -db nuccore -query "HG994383" | efetch -format docsum | xtract -pattern DocumentSummary -element Title,SubType
Canis lupus genome assembly, chromosome: 1  chromosome

$ esearch -db nuccore -query "NG_060216.1" | efetch -format docsum | xtract -pattern DocumentSummary -element Title,SubType
Macaca nemestrina 5-hydroxytryptamine receptor 3D pseudogene (HTR3D)

But I would like to do this automatically and not by hand.


The above was an example. You will need a for loop or a similar construct to do this for multiple queries. There is also the epost method, which you can look up for this.
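A minimal sketch of such a loop, assuming EDirect is installed and on your PATH. The function names `count_links` and `classify_accessions` are made up for illustration, and the sketch assumes that `elink` reports a `Count` of 0 (or nothing at all) when an accession has no link to the genome database. It runs one esearch/elink chain per accession, so it is only meant for short lists.

```shell
# count_links ACCESSION
# Hypothetical helper: prints how many genome records the accession links to.
count_links() {
  esearch -db nuccore -query "$1" |
    elink -target genome |
    xtract -pattern ENTREZ_DIRECT -element Count
}

# classify_accessions ACC [ACC ...]
# Hypothetical helper: prints one tab-separated line per accession,
# marking it "linked" or "not_linked".
classify_accessions() {
  for acc in "$@"; do
    n=$(count_links "$acc")
    # Treat an empty result as zero links (assumption, see above).
    if [ "${n:-0}" -gt 0 ]; then
      printf '%s\tlinked\n' "$acc"
    else
      printf '%s\tnot_linked\n' "$acc"
    fi
  done
}
```

You would then call it as `classify_accessions NG_060216.1 NG_033828.1 HG994383.1`. Keep in mind that a link to the genome database does not by itself show that the accession is a full genome.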

2.2 years ago
vkkodali_ncbi ★ 3.7k

None of your example accessions "represent a full genome". That said, the scope of the links to a given database is described here: https://eutils.ncbi.nlm.nih.gov/entrez/query/static/entrezlinks.html

Alternately, you can do something like this:

$ einfo -db nuccore | xtract -pattern Link -element Name,Description,DbTo | head -n5 
nuccore_assembly              Assembly                     assembly
nuccore_assembly_wgscontig    Assembly                     assembly
nuccore_biocollections        BioCollections               biocollections
nuccore_bioproject            Related BioProject entry     bioproject
nuccore_bioproject_reference  Reference Genome BioProject  bioproject

If you are interested in links between nuccore and assembly, you would then run elink as follows:

esearch -db nuccore -query <...> | elink -target assembly -name nuccore_assembly <...>