Map Between Uniprot Accession And Gene Symbol Using R Or/And Mysql
8.9 years ago
jfertaj ▴ 110

Dear list,

I am trying to map UniProt accessions to gene symbols (official HUGO gene symbols). I've tried several R approaches and a MySQL approach posted on this list before.

My R approaches use `select()` from a Bioconductor annotation package and the biomaRt package. However, several UniProt accessions do not map to a gene symbol. I've tried a MySQL approach posted by @Pierre Lindenbaum, and it solves some cases, but I would like to modify it to return gene symbols rather than Ensembl IDs, although I could map the Ensembl IDs to gene symbols afterwards.

mysql approach

echo -e "Q7TNF6\nQ53XJ8\nP05787\nP0CG48\nQ96CG1\nD3DR86\nQ96FS5" |awk '{printf("select REF.acc,REF.extAcc1,REF.extAcc2,REF.extAcc3 from uniProt.extDbRef as REF, uniProt.extDb as EXT where EXT.val=\"ENSEMBL\" and REF.extDb=EXT.id and REF.acc=\"%s\";\n",$0);}' |mysql --user=genome -A -D hg19 -N
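
A gene-symbol variant of this query could, as a minimal sketch, go through the UCSC `kgXref` table instead of `uniProt.extDbRef`. The sketch assumes that `hg19.kgXref` has the columns `spID` (UniProt accession) and `geneSymbol`, and that the symbols stored there are close enough to the official HUGO ones; neither point comes from the original query. `build_symbol_query` is a hypothetical helper name, and the host in the usage comment is also an assumption.

```shell
# Build ONE SQL statement that looks up gene symbols for the UniProt
# accessions read from stdin (one accession per line, blank lines skipped).
# ASSUMPTION: hg19.kgXref has columns spID (UniProt accession) and geneSymbol.
build_symbol_query() {
  awk '
    # Collect the accessions as a quoted, comma-separated list.
    NF { ids = ids sep "\"" $1 "\""; sep = "," }
    END {
      # Print nothing at all when no accession was supplied.
      if (ids != "")
        printf("select distinct spID, geneSymbol from kgXref where spID in (%s);\n", ids)
    }'
}

# Usage (needs network access; the host name is an assumption):
# printf 'P05787\nQ96FS5\n' | build_symbol_query |
#   mysql --host=genome-mysql.soe.ucsc.edu --user=genome -A -D hg19 -N
```

Accessions that are absent from `kgXref` simply return no row, so the output can be compared against the input list to see which ones still fail to map.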

R approaches

annotation.col1 <- select(, keys=c('Q7TNF6','Q53XJ8','P05787','P0CG48','Q96CG1','D3DR86','Q96FS5'), columns=c('UNIPROT', 'SYMBOL', 'ENTREZID'), keytype="UNIPROT")

ensembl <- useMart('ensembl', dataset="hsapiens_gene_ensembl")
annotation <- getBM(attributes=c("uniprot_swissprot_accession", "hgnc_symbol", "uniprot_genename"), filters="uniprot_swissprot_accession", values=c('Q7TNF6','Q53XJ8','P05787','P0CG48','Q96CG1','D3DR86','Q96FS5'), mart=ensembl)

Also, I noticed that the two R approaches (`select()` and biomaRt) differ in which UniProt accessions fail to map.

Thanks a lot

r bioconductor mysql • 9.5k views
