MyGene.info is a web service that provides up to date annotations in several fields and is great for gene ID conversion. All species from NCBI and Ensembl are supported and annotations are updated weekly to ensure the latest annotations are available. Both python and R/Bioconductor clients are easy to use.
MyGene.info may not be able to solve your problem with Agilent IDs but several other IDs from Genebank, Uniprot, Ensembl, Refseq are all available. Also, from either client, you can query several thousand genes at once.
Here is some example syntax for ID conversion from the python module:
>>> import mygene
>>> mg = mygene.MyGeneInfo()
>>> mg.metadata['available_fields'] ## returns available query terms, pay special attention to "ensemblgene", "entrezgene", "symbol" and "uniprot"
[u'accession', u'alias', u'biocarta', u'chr', u'end', u'ensemblgene', u'ensemblprotein', u'ensembltranscript', u'entrezgene', u'exons', u'flybase', u'generif', u'go', u'hgnc', u'homologene', u'hprd', u'humancyc', u'interpro', u'ipi', u'kegg', u'mgi', u'mim', u'mirbase', u'mousecyc', u'name', u'netpath', u'pdb', u'pfam', u'pharmgkb', u'pid', u'pir', u'prosite', u'ratmap', u'reactome', u'reagent', u'refseq', u'reporter', u'retired', u'rgd', u'smpdb', u'start', u'strand', u'summary', u'symbol', u'tair', u'taxid', u'type_of_gene', u'unigene', u'uniprot', u'wikipathways', u'wormbase', u'xenbase', u'yeastcyc', u'zfin']
>>> xli = ['DDX26B','CCDC83', 'MAST3', 'RPL11', 'ZDHHC20', 'LUC7L3', 'SNORD49A', 'CTSH', 'ACOT8']
>>> mg.querymany(xli, scopes="symbol", fields=["uniprot", "ensembl.gene", "reporter"], species="human", as_dataframe=True)
A DataFrame is returned:
And now for the Bioconductor package:
library(mygene)
xli <-  c('DDX26B','CCDC83',  'MAST3', 'RPL11', 'ZDHHC20',  'LUC7L3',  'SNORD49A',  'CTSH', 'ACOT8')
queryMany(xli, scopes="symbol", fields=c("uniprot", "ensembl.gene", "reporter"), species="human")
This returns a DataFrame:
Finished
DataFrame with 9 rows and 5 columns
                     ensembl.gene         _id uniprot.Swiss-Prot uniprot.TrEMBL       query
                  <CharacterList> <character>        <character>         <List> <character>
1                 ENSG00000165359      203522             Q5JSJ4       ########      DDX26B
2                 ENSG00000150676      220047             Q8IWF9       ########      CCDC83
3                 ENSG00000099308       23031             O60307       ########       MAST3
4                 ENSG00000142676        6135             P62913       ########       RPL11
5                 ENSG00000180776      253832             Q5W0Z9       ########     ZDHHC20
6                 ENSG00000108848       51747             O95232       ########      LUC7L3
7 ENSG00000277370,ENSG00000175061       26800                 NA       ########    SNORD49A
8                 ENSG00000103811        1512             P09668       ########        CTSH
9                 ENSG00000101473       10005             O14734       ########       ACOT8
                    
                
                 
How frequently do you need things updated? DAVID does have yearly releases so far, but their latest release is this month (March 2010). See the release announcement here: http://david.abcc.ncifcrf.gov/forum/cgi-bin/ikonboard.cgi?act=ST;f=10;t=25 This does suggest the underlying mapping framework will be updated along with it in the 6.7 beta, and hence should include more recent information for the conversion tool
Hi
I am faced the same problem.I did differential gene expression by using this protocol "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown"
I have gene list file after using the ballgown
the gene id in this files is as
I want to perform gene ontology next by using tool AgriGo. these gene ids are not recognized in any database.
I have use the tool bioDBnet to convert these ids into ensembl gene id .but not found result.
I would say that the suffix numbers are Entrez IDs (Gene IDs).