Getting RefSeq IDs from protein IDs
2.6 years ago
Amogh • 0

Hi all, I have a list of protein IDs (AVK11642.1, QQV59463.1, ASA16829.1, ...) and I want to get the whole-genome accession numbers of the genomes that encode each protein, i.e. this kind of data:

NZ_LR130537_1
NZ_LR130535_1
NZ_LR130536_1
NZ_LR130534_1
NZ_LR130530_1
NZ_LR130531_1
NZ_LR130527_1

How can I do this at NCBI or with the E-utilities tools?

Thank you all

2.6 years ago
GenoMax 154k

Using Entrez Direct (EDirect):

$ esearch -db protein -query QQV59463 | elink -target nuccore | elink -target assembly | esummary | xtract -pattern DocumentSummary -element Genbank,RefSeq
GCA_016743115.1 GCF_016743115.1
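
The identifiers that begin with NZ_ are RefSeq nucleotide sequence accessions (normally written with a dot before the version, e.g. NZ_LR130537.1), whereas GCA_/GCF_ are assembly accessions. A rough sketch of how the chain above could be extended to a list of proteins and then to the NZ_ records of each assembly is below. It assumes EDirect is installed and on your PATH. The function names are made up for illustration, and the link name assembly_nuccore_refseq is my assumption of the right assembly-to-nuccore link, so please check it against the output of `einfo -db assembly -links` before relying on it.

```shell
#!/usr/bin/env bash
# Sketch only: assumes EDirect (esearch, elink, efetch, esummary, xtract) is on PATH.
# All function names here are hypothetical helpers, not part of EDirect.

# Protein accession -> RefSeq assembly accession (GCF_...), same chain as above.
protein_to_gcf() {
    esearch -db protein -query "$1" |
        elink -target nuccore |
        elink -target assembly |
        esummary |
        xtract -pattern DocumentSummary -element RefSeq
}

# Assembly accession -> accessions of its RefSeq nucleotide records.
# ASSUMPTION: assembly_nuccore_refseq is the appropriate link name.
gcf_to_nuccore() {
    esearch -db assembly -query "$1" |
        elink -target nuccore -name assembly_nuccore_refseq |
        efetch -format acc
}

# Plain text filter: keep only accessions that start with NZ_.
# "|| true" stops grep's "no match" exit status from looking like a failure.
keep_nz() {
    grep '^NZ_' || true
}

# For every protein ID given as an argument, print "protein<TAB>NZ_accession".
proteins_to_nz() {
    local id gcf
    for id in "$@"; do
        for gcf in $(protein_to_gcf "$id"); do
            gcf_to_nuccore "$gcf" | keep_nz |
                awk -v id="$id" '{ print id "\t" $0 }'
        done
    done
}
```

After sourcing the file, something like `proteins_to_nz AVK11642.1 QQV59463.1 ASA16829.1 > protein_to_nz.tsv` should give one line per protein and sequence. Not every assembly will have NZ_ records (some RefSeq genomes use NC_ instead), so an empty result for a protein is possible. For long lists it is worth adding an NCBI API key and a short pause between queries.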
Hey there, thanks a lot. Sorry, I am new to this and don't know much. What I want are the accession numbers that begin with NZ_ (NZ_xxxxx.1); I am not sure what they are called. How do I get them from these GCF accessions?
