Question: extract EC number from entrez esearch query
6 months ago by
bioguy30
bioguy30 wrote:

Does anyone know how to use NCBI's Entrez command-line tools (https://www.ncbi.nlm.nih.gov/books/NBK25501/) to extract feature information for a specific protein? Specifically, I need to find the EC numbers for given queries.

For example, how do I programmatically retrieve the EC number for the following protein (citrate synthase, EC_number=2.3.3.16)?

https://www.ncbi.nlm.nih.gov/protein/RRJ88579.1

I need to do this for a large number of proteins, but for now just getting it for one would be great. I've been using queries like "esearch -db protein -query 'RRJ88579.1' | efetch -format docsum", but this does not return the EC number.

6 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum121k wrote:

Use xmllint with an XPath expression on the GenBank XML returned by efetch:

$ wget -O - -q "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=RRJ88579.1&retmode=xml&rettype=gb" |\
  xmllint --xpath '//GBQualifier[GBQualifier_name="EC_number"]/GBQualifier_value/text()' -

2.3.3.16
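
Since the question mentions a large number of proteins, the same wget + xmllint call could be wrapped in a loop. The following is only a rough sketch: the function names `efetch_url` and `ec_for_accessions` and the input file `accessions.txt` are made up for illustration, and the one-second pause is a conservative assumption to stay under NCBI's limit of 3 requests per second without an API key.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build the efetch URL used above for one accession.
efetch_url() {
    printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=%s&retmode=xml&rettype=gb\n' "$1"
}

# Hypothetical helper: read accessions from stdin (one per line) and
# print "accession<TAB>EC number", or NA when no EC_number qualifier is found.
ec_for_accessions() {
    local acc ec
    while IFS= read -r acc; do
        # Skip empty lines in the input list.
        if [ -z "$acc" ]; then
            continue
        fi
        # xmllint reports an empty node set on stderr, so that is silenced here.
        ec=$(wget -O - -q "$(efetch_url "$acc")" |
             xmllint --xpath '//GBQualifier[GBQualifier_name="EC_number"]/GBQualifier_value/text()' - 2>/dev/null)
        printf '%s\t%s\n' "$acc" "${ec:-NA}"
        # Assumed pause between requests; adjust if you use an API key.
        sleep 1
    done
}
```

It might be called as `ec_for_accessions < accessions.txt > ec_numbers.tsv`. Records carrying more than one EC_number qualifier would probably need extra handling, because the XPath returns all matching text nodes together.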
6 months ago by
bioguy30
bioguy30 wrote:

Excellent, thank you.

Alternative method I just found:

esearch -db 'protein' -query 'RRJ88579.1' | efetch -format gpc | xtract -insd Protein EC_number
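
For many proteins, this pipeline can probably skip esearch and pass a comma-separated accession list straight to `efetch -id`. A minimal sketch, assuming EDirect is installed; `join_ids`, `ec_table`, and the file name `accessions.txt` are hypothetical names introduced here:

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn accessions on stdin (one per line, blank lines
# ignored) into a single comma-separated list.
join_ids() {
    grep -v '^[[:space:]]*$' | paste -sd, -
}

# Hypothetical wrapper: fetch every accession listed in the file "$1" with one
# efetch call and let xtract pull out the EC_number qualifier of each Protein
# feature.
ec_table() {
    local ids
    ids=$(join_ids < "$1")
    efetch -db protein -id "$ids" -format gpc | xtract -insd Protein EC_number
}
```

Running `ec_table accessions.txt` should then give one tab-separated line per record, with the accession in the first column, since that is how `xtract -insd` normally lays out its output.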
