Specify Assembly for NCBI Entrez Query
8.5 years ago
paulparsons ▴ 150

Does anyone know whether it is possible to specify which assembly to use when constructing an Entrez query?

For example, if I do such a query with EDirect:

esearch -db gene -query "brca1 [ALL]human[ORGN]" -sort "relevance" | \
efetch -format docsum | \
xtract -pattern DocumentSummary -element Name MapLocation Description OtherAliases Id \
-block GenomicInfo -element ChrLoc ChrAccVer ChrStart ChrStop


The GenomicInfo that I get back appears to be based on the GRCh38 assembly. The application that I'm developing needs GRCh37 coordinates, however.

Any help is much appreciated.

entrez ncbi • 4.1k views
8.5 years ago
paulparsons ▴ 150

Actually, it turns out that this can be done. I wrote to NCBI and got a response.

The key is to look in the LocationHistType block for:

• a specific annotation release (see here and here for some explanation of annotation releases). For example, GRCh37.p13 is coded by NCBI as Annotation Release 105.

• the corresponding assembly accession, which is a RefSeq Assembly ID. For GRCh37.p13 it is GCF_000001405.25 (see here for more information).

The EDirect commands should look something like this:

esearch -db gene -query "brca1 [ALL] AND human [ORGN]" | \
efetch -format docsum | \
xtract -pattern DocumentSummary \
  -element Name MapLocation Description OtherAliases Id ChrLoc \
  -block LocationHistType \
  -match "AnnotationRelease:105" -and "AssemblyAccVer:GCF_000001405.25" \
  -element ChrAccVer ChrStart ChrStop
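
For anyone who would rather do the same filtering outside of xtract, here is a minimal Python sketch of the idea. It assumes the docsum XML has already been saved or captured from efetch, and that the element names match the ones used in the xtract command above. The sample document, the `LocationHist` wrapper element, the function name `locations_for_assembly`, and all coordinates are made up for illustration; they are not real BRCA1 positions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written stand-in for `efetch -format docsum` output.
# Element names follow the xtract command above; the coordinates are
# placeholders, NOT real BRCA1 positions.
SAMPLE_DOCSUM = """
<DocumentSummarySet>
  <DocumentSummary>
    <Name>BRCA1</Name>
    <LocationHist>
      <LocationHistType>
        <AnnotationRelease>109</AnnotationRelease>
        <AssemblyAccVer>GCF_000001405.38</AssemblyAccVer>
        <ChrAccVer>NC_000017.11</ChrAccVer>
        <ChrStart>200</ChrStart>
        <ChrStop>100</ChrStop>
      </LocationHistType>
      <LocationHistType>
        <AnnotationRelease>105</AnnotationRelease>
        <AssemblyAccVer>GCF_000001405.25</AssemblyAccVer>
        <ChrAccVer>NC_000017.10</ChrAccVer>
        <ChrStart>400</ChrStart>
        <ChrStop>300</ChrStop>
      </LocationHistType>
    </LocationHist>
  </DocumentSummary>
</DocumentSummarySet>
"""


def locations_for_assembly(docsum_xml, annotation_release, assembly_acc):
    """Return (name, chr_acc, start, stop) tuples for one assembly.

    Keeps only LocationHistType blocks whose AnnotationRelease and
    AssemblyAccVer both match, mirroring the -match ... -and ... filter.
    """
    root = ET.fromstring(docsum_xml.strip())
    hits = []
    for summary in root.iter("DocumentSummary"):
        name = summary.findtext("Name")
        for hist in summary.iter("LocationHistType"):
            same_release = hist.findtext("AnnotationRelease") == annotation_release
            same_assembly = hist.findtext("AssemblyAccVer") == assembly_acc
            if same_release and same_assembly:
                hits.append((
                    name,
                    hist.findtext("ChrAccVer"),
                    int(hist.findtext("ChrStart")),
                    int(hist.findtext("ChrStop")),
                ))
    return hits


if __name__ == "__main__":
    # Annotation Release 105 / GCF_000001405.25 corresponds to GRCh37.p13.
    for row in locations_for_assembly(SAMPLE_DOCSUM, "105", "GCF_000001405.25"):
        print(*row, sep="\t")
```

As with the EDirect version, this still fetches every location record and filters afterwards; it does not restrict the Entrez query itself to an assembly.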

That's very valuable to know. Thanks for sharing what you found out!

Good to know. Technically, this is "retrieve everything and parse for version" rather than "query using version", but whatever works.

Good point. In my case (although maybe not in all cases), the difference doesn't really matter, and I can get the information that I need. Cheers.

8.5 years ago
Neilfws 49k

I don't have a reference or source, but I'm certain that EUtils uses the current build, with no option to use previous versions. This came up in a previous question.

Ha, there are some interesting dependencies there that I had never considered before: anyone who relies on EUtils is tied to the current release.