Question: Gene And Transcript Id Conversion
6.8 years ago by
Davy360
United States
Davy360 wrote:

Hi All,

I'm trying to get a list of all Entrez Gene IDs with their corresponding gene names and transcript IDs. I've tried Ensembl BioMart and the UCSC Genome Browser but had no luck with either: Ensembl gives me many rows with missing data, and UCSC doesn't appear to offer Entrez IDs at all. Does anyone know where I could download or query a table of all Entrez genes?

6.8 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum118k wrote:

All the information you're looking for is available at: ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/
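If the tab-delimited dumps in that directory are enough, something like the following might work. This is only a rough sketch: it assumes the directory contains `gene_info.gz` and `gene2refseq.gz`, that each file starts with a `#`-prefixed header line, and that the columns are named `tax_id`, `GeneID`, `Symbol` and `RNA_nucleotide_accession.version`. Check the README in the directory before relying on any of these names. The function names and the sample rows are made up for illustration.

```python
import io
from collections import defaultdict


def read_tsv(handle):
    """Yield one dict per data row, keyed by the names in the '#' header line."""
    header = handle.readline().lstrip("#").rstrip("\n").split("\t")
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        yield dict(zip(header, fields))


def gene_table(gene_info, gene2refseq, tax_id="9606",
               rna_column="RNA_nucleotide_accession.version"):
    """Map GeneID -> (symbol, sorted transcript accessions) for one taxon.

    The column names are assumptions about the file layout, not verified here.
    """
    symbols = {row["GeneID"]: row["Symbol"]
               for row in read_tsv(gene_info) if row["tax_id"] == tax_id}
    transcripts = defaultdict(set)
    for row in read_tsv(gene2refseq):
        accession = row[rna_column]
        # "-" is treated as "no value" for this column (assumed convention).
        if row["GeneID"] in symbols and accession != "-":
            transcripts[row["GeneID"]].add(accession)
    return {gene_id: (symbol, sorted(transcripts.get(gene_id, ())))
            for gene_id, symbol in symbols.items()}


# Tiny in-memory stand-ins for the real files (illustrative rows only).
sample_gene_info = io.StringIO(
    "#tax_id\tGeneID\tSymbol\n"
    "9606\t7157\tTP53\n"
    "9606\t672\tBRCA1\n"
    "10090\t22059\tTrp53\n"
)
sample_gene2refseq = io.StringIO(
    "#tax_id\tGeneID\tstatus\tRNA_nucleotide_accession.version\n"
    "9606\t7157\tREVIEWED\tNM_000546.6\n"
    "9606\t7157\tREVIEWED\tNM_001126112.3\n"
    "9606\t7157\tREVIEWED\t-\n"
    "10090\t22059\tVALIDATED\tNM_011640.3\n"
)
table = gene_table(sample_gene_info, sample_gene2refseq)
```

For the real downloads, the two handles could be opened with `gzip.open(path, "rt")` instead of `io.StringIO`. Genes without any transcript row simply get an empty list rather than being dropped.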

Excuse my ignorance, but do you have any suggestions for the best way to deal with these files? I was looking into parsing them with Python, but that seems very DIY for something that should be relatively routine.

Reply written 6.8 years ago by Davy360

I would use the ASN.1 files, convert them to XML using asn2xml ( http://www.ncbi.nlm.nih.gov/IEB/ToolBox/XML/ncbixml.txt ), and parse the result with a (Python-based) SAX parser.
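A minimal sketch of the SAX step, using Python's standard `xml.sax` module. The element names (`Entrezgene`, `Gene-track_geneid`, `Gene-ref_locus`, `Gene-commentary_accession`) are assumptions about what the converted XML looks like, so inspect a real record first. Treating accessions that start with `NM_`, `NR_`, `XM_` or `XR_` as transcripts is also just a heuristic, and the embedded XML is a stripped-down illustration, not real output.

```python
import xml.sax

# Accession prefixes treated as RNA transcripts (a heuristic, see above).
RNA_PREFIXES = ("NM_", "NR_", "XM_", "XR_")


class GeneHandler(xml.sax.ContentHandler):
    """Collect gene ID, symbol and transcript accessions per <Entrezgene>."""

    def __init__(self):
        super().__init__()
        self.genes = []       # one dict per finished <Entrezgene> record
        self._current = None  # record being built, or None outside a record
        self._text = []       # character chunks of the element being read

    def startElement(self, name, attrs):
        self._text = []
        if name == "Entrezgene":
            self._current = {"gene_id": None, "symbol": None,
                             "transcripts": set()}

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so buffer them.
        self._text.append(content)

    def endElement(self, name):
        if self._current is None:
            return
        value = "".join(self._text).strip()
        if name == "Gene-track_geneid" and self._current["gene_id"] is None:
            self._current["gene_id"] = value
        elif name == "Gene-ref_locus" and self._current["symbol"] is None:
            self._current["symbol"] = value
        elif (name == "Gene-commentary_accession"
              and value.startswith(RNA_PREFIXES)):
            self._current["transcripts"].add(value)
        elif name == "Entrezgene":
            self.genes.append(self._current)
            self._current = None
        self._text = []


# Stripped-down, hand-written record for illustration only.
SAMPLE_XML = b"""<Entrezgene-Set>
  <Entrezgene>
    <Entrezgene_track-info><Gene-track>
      <Gene-track_geneid>7157</Gene-track_geneid>
    </Gene-track></Entrezgene_track-info>
    <Entrezgene_gene><Gene-ref>
      <Gene-ref_locus>TP53</Gene-ref_locus>
    </Gene-ref></Entrezgene_gene>
    <Entrezgene_locus><Gene-commentary>
      <Gene-commentary_accession>NC_000017</Gene-commentary_accession>
      <Gene-commentary_products><Gene-commentary>
        <Gene-commentary_accession>NM_000546</Gene-commentary_accession>
      </Gene-commentary></Gene-commentary_products>
    </Gene-commentary></Entrezgene_locus>
  </Entrezgene>
</Entrezgene-Set>"""

handler = GeneHandler()
xml.sax.parseString(SAMPLE_XML, handler)
```

For a large converted file, `xml.sax.parse(path_or_file, handler)` streams the document, so memory use stays bounded by the `genes` list rather than by the size of the XML.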

Reply written 6.8 years ago by Pierre Lindenbaum