RefSeq/GenBank accession to Entrez ID for clusterProfiler
5 months ago
mail2steff ▴ 70

I have a list of GenBank protein accessions for several bacterial species (all non-model organisms). The next step is to run GSEA on these proteins. I tried to convert the GenBank accessions to Entrez IDs, but unfortunately I couldn't find the corresponding IDs on the NCBI website. Some of the accessions are:

> QQR17398.1 QQR17399.1 QQR19149.1 QQR17404.1 QQR17411.1 QQR17417.1
> QQR17418.1 QQR17421.1 QQR17422.1 QQR17448.1 QQR17450.1 QQR17452.1
> QQR17453.1 QQR17457.1 QQR17460.1

How do I convert these IDs into a form suitable for clusterProfiler?

Any help would be appreciated.

IDconversion NCBI ClusterProfiler Genebank • 343 views
5 months ago
GenoMax 142k

Using EntrezDirect:

$ esearch -db protein -query QQR17398 | esummary | xtract -pattern DocumentSummary -element Id
1957959289

For more than one ID, use epost. Put the IDs in a file, one per line, and then run:

$ epost -db protein -format acc -input id_file | esummary | xtract -pattern DocumentSummary -element Id,Caption
1957961040      QQR19149
1957959308      QQR17417
1957959302      QQR17411
1957959295      QQR17404
1957959290      QQR17399
1957959289      QQR17398
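
The accessions in the question are pasted several per line, while the epost command above expects one per line. A minimal sketch for building id_file from such a list follows. It assumes the list was saved to a file called accessions.txt (a hypothetical name, not from the original post) and that a leading ">" on each line is just a quote marker. Version suffixes such as ".1" are left untouched, and no check is made here on whether epost needs them.

```shell
# accessions.txt is a hypothetical input file; the two lines written here
# only stand in for the list as pasted in the question.
printf '%s\n' '> QQR17398.1 QQR17399.1 QQR19149.1' '> QQR17404.1 QQR17411.1' > accessions.txt

# Split on spaces/tabs so each token is on its own line, drop the bare ">"
# quote markers and empty lines, then sort and remove duplicates.
tr -s ' \t' '\n' < accessions.txt | grep -v -e '^>$' -e '^$' | sort -u > id_file

# id_file now holds one accession per line and can be passed to the
# epost command shown above via -input id_file.
cat id_file
```

Sorting is only there to make duplicates easy to drop. The epost output above is not returned in input order either, so keep the Caption column to match each Id back to its accession.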