Question: Getting Protein Information from NCBI Gene ID
joseph.orlando (United States) wrote, 4.6 years ago:

Hi there,

I have several Excel files with 3000+ feature IDs from next-generation sequencing experiments. The feature IDs look like this:

 

LOC733603
MS4A7
CRISP3
RETN
TNFAIP6
ALPL
MMP8
IRG1
LTF
KCNJ15

HCRTR1

 

Basically, I would like to gather the following information about each of these features for Sus scrofa:

- Gene name

- Gene description

- Protein Name

- Amino acid sequence

I am using Python, mainly the urllib2 package, to make HTTP requests to the NCBI Gene database.

I can easily get the gene name and gene description by querying NCBI's Gene database. I then try to use the associated gene ID to query either NCBI's Protein database or UniProt, but I am not sure which is the wiser approach. Has anyone faced the same scenario and have useful advice, or know other ways of obtaining the data I am interested in?

Even easier, is there a way to access the NCBI-related protein information with an NCBI gene ID?

 

Joey

python protein ncbi id ncbi gene • 2.7k views
Elisabeth Gasteiger (Geneva) wrote, 4.6 years ago:

To obtain information corresponding to these gene symbols from UniProt, I recommend that you read this FAQ:

http://www.uniprot.org/help/gene_symbol_mapping

Once you have your results, you can use the "Columns" button and customize your result table to include columns for gene and protein names and the amino acid sequence:

Query result in html view

Query result in tab-delimited format

Documentation about programmatic access to UniProt
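
For anyone who wants to script the same lookup rather than use the web table, here is a minimal sketch in Python 3 (so `urllib.parse` instead of `urllib2`). It assumes the current UniProt REST endpoint at `rest.uniprot.org`, which is newer than this thread, and the query fields `gene_exact` and `organism_id` with taxonomy ID 9823 for Sus scrofa. The function names `build_uniprot_url` and `parse_uniprot_tsv` are made up for illustration.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: the current UniProt REST search endpoint (newer than this thread).
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"
PIG_TAXON_ID = 9823  # NCBI taxonomy ID for Sus scrofa


def build_uniprot_url(gene_symbol, taxon_id=PIG_TAXON_ID):
    """Return a search URL requesting a tab-separated table for one gene symbol."""
    params = {
        "query": f"gene_exact:{gene_symbol} AND organism_id:{taxon_id}",
        "format": "tsv",
        # Columns to return: accession, gene names, protein name, sequence.
        "fields": "accession,gene_names,protein_name,sequence",
    }
    return UNIPROT_SEARCH + "?" + urlencode(params)


def parse_uniprot_tsv(text):
    """Parse a tab-separated response into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))
```

To use it, fetch `build_uniprot_url("ALPL")` with `urllib.request.urlopen`, decode the body as UTF-8, and pass the text to `parse_uniprot_tsv`. With 3000+ symbols it is worth pausing between requests or using the batch ID mapping described in the FAQ above.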

 


This works perfectly! Thanks so much :)

Reply written 4.6 years ago by joseph.orlando
Prash (Jaipur, India) wrote, 4.6 years ago:

In addition to the suggestion above, you could use Batch Entrez, which can provide links for the IDs that you upload:

 

http://www.ncbi.nlm.nih.gov/sites/batchentrez
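
If you would rather script this than upload a file, the NCBI E-utilities offer a programmatic route from a Gene ID to its linked Protein records. The following is only a sketch in Python 3: it builds an ELink URL (`dbfrom=gene`, `db=protein`), pulls the linked protein UIDs out of the XML response, and builds an EFetch URL for FASTA sequences. The function names are hypothetical, and the gene ID you pass in is whatever you already retrieved from the Gene database.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_elink_url(gene_id):
    """Return an ELink URL that follows links from one Gene record to Protein records."""
    params = {"dbfrom": "gene", "db": "protein", "id": str(gene_id)}
    return EUTILS + "elink.fcgi?" + urlencode(params)


def parse_elink_xml(xml_text):
    """Collect linked protein UIDs from an ELink XML response, without duplicates."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//LinkSetDb/Link/Id")]
    # The same protein can appear under several link names; keep first occurrences.
    return list(dict.fromkeys(ids))


def build_efetch_fasta_url(protein_ids):
    """Return an EFetch URL that retrieves the given protein records as FASTA text."""
    params = {
        "db": "protein",
        "id": ",".join(protein_ids),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EUTILS + "efetch.fcgi?" + urlencode(params)
```

Fetch each URL with `urllib.request.urlopen` and decode the body before parsing. NCBI asks clients to limit their request rate, so for thousands of genes add a short delay between calls.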

cdsouthan wrote, 4.6 years ago:

Yet another option is to approach the mapping via Ensembl pig: http://www.ensembl.org/Sus_scrofa/Info/Annotation
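
The Ensembl route can also be scripted. As a rough sketch, assuming the public Ensembl REST service at `rest.ensembl.org` and its `lookup/symbol` endpoint, the Python 3 code below builds a lookup URL for a pig gene symbol and extracts the stable ID, name and description from the JSON reply. The function names are invented for this example, and symbols such as LOC733603 may not resolve in Ensembl at all.

```python
import json
from urllib.parse import quote

# Assumption: the public Ensembl REST service.
ENSEMBL_REST = "https://rest.ensembl.org"


def build_lookup_url(symbol, species="sus_scrofa"):
    """Return a lookup/symbol URL that asks for a JSON description of one gene."""
    return (
        f"{ENSEMBL_REST}/lookup/symbol/{species}/{quote(symbol)}"
        "?content-type=application/json"
    )


def summarize_gene(json_text):
    """Pick the stable ID, display name and description out of a lookup response."""
    record = json.loads(json_text)
    return {
        "id": record.get("id"),
        "name": record.get("display_name"),
        "description": record.get("description"),
    }
```

Fetch `build_lookup_url("ALPL")` with `urllib.request.urlopen` and pass the decoded body to `summarize_gene`. A symbol that Ensembl does not know returns an HTTP error, so wrap the request in a try/except when looping over a long list.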
