Question: Extract EC number from Entrez esearch query
bioguy wrote, 9 weeks ago:

Does anyone know how to use NCBI's Entrez command-line tools (https://www.ncbi.nlm.nih.gov/books/NBK25501/) to extract feature information about a specific protein query? Specifically, I need to find EC numbers for given queries.

For example, how do I programmatically retrieve the EC number for the following protein (citrate synthase, EC_number=2.3.3.16)?

https://www.ncbi.nlm.nih.gov/protein/RRJ88579.1

I need to do this for a large number of proteins, but for now just getting it for one would be great. I've been using queries like "esearch -db protein -query 'RRJ88579.1' | efetch -format docsum", but this does not return the EC number.

Tags: ecid, genomics, entrez, protein, ncbi
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 9 weeks ago:

Use xmllint with an XPath query on the GenBank XML returned by efetch:

$ wget -O - -q "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=RRJ88579.1&retmode=xml&rettype=gb" |
    xmllint --xpath '//GBQualifier[GBQualifier_name="EC_number"]/GBQualifier_value/text()' -

2.3.3.16
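
Since the question mentions a large number of proteins, the one-liner above could be wrapped in small shell functions so the accession is a parameter. This is only a sketch: the function names efetch_url and ec_for_accession are made up for illustration, and it assumes wget and xmllint are installed and that each record carries at most one EC_number qualifier.

```shell
#!/bin/sh
# Hypothetical helper: build the efetch URL for one protein accession.
# The URL layout is copied from the one-liner above; only the id changes.
efetch_url() {
    printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=%s&retmode=xml&rettype=gb\n' "$1"
}

# Hypothetical helper: fetch the GenBank XML for one accession and print
# its EC_number qualifier. Prints nothing if the record has no EC number
# (xmllint's "XPath set is empty" message on stderr is discarded).
ec_for_accession() {
    wget -q -O - "$(efetch_url "$1")" |
        xmllint --xpath '//GBQualifier[GBQualifier_name="EC_number"]/GBQualifier_value/text()' - 2>/dev/null
}
```

With network access, ec_for_accession RRJ88579.1 should behave like the command above. When looping over many accessions, pausing between calls (for example sleep 1) keeps the request rate under NCBI's limit of three requests per second for clients without an API key.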
bioguy wrote, 9 weeks ago:

Excellent, thank you.

Alternative method I just found:

esearch -db 'protein' -query 'RRJ88579.1' | efetch -format gpc | xtract -insd Protein EC_number
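
For a whole list of proteins, the same pipeline could be run once per accession. A rough sketch, assuming EDirect is installed and the accessions sit one per line in a file; the function name ec_table and the file name accessions.txt are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run the esearch | efetch | xtract pipeline once per
# accession listed (one per line) in the file given as the first argument.
ec_table() {
    while IFS= read -r acc; do
        # Skip blank lines in the accession list.
        [ -n "$acc" ] || continue
        # Redirect stdin from /dev/null so esearch cannot consume the rest
        # of the accession list that the while loop is reading.
        esearch -db protein -query "$acc" < /dev/null |
            efetch -format gpc |
            xtract -insd Protein EC_number
    done < "$1"
}
```

Calling ec_table accessions.txt > ec_numbers.tsv would then collect whatever xtract prints for each record into one file.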
