Map Between UniProt Accession And Gene Symbol Using R And/Or MySQL
8.9 years ago
jfertaj ▴ 110

Dear list,

I am trying to map UniProt accessions to gene symbols (official HUGO gene symbols). I've tried several R approaches and a MySQL approach posted on this list before.

My R approaches use the org.Hs.eg.db and biomaRt packages. However, several UniProt accessions do not map to a gene symbol. I've tried a MySQL approach posted by @Pierre Lindenbaum, and it solves some cases, but I would like to modify it to return gene symbols rather than Ensembl IDs, although I could map the Ensembl IDs to gene symbols afterwards.

MySQL approach

echo -e "Q7TNF6\nQ53XJ8\nP05787\nP0CG48\nQ96CG1\nD3DR86\nQ96FS5" |awk '{printf("select REF.acc,REF.extAcc1,REF.extAcc2,REF.extAcc3 from uniProt.extDbRef as REF, uniProt.extDb as EXT where EXT.val=\"ENSEMBL\" and EXT.id=REF.extDb and REF.acc=\"%s\";\n",$0);}' |mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N

R approaches

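library('org.Hs.eg.db')  # the annotation package is named org.Hs.eg.db
# 'columns' is the current name of the select() argument formerly called 'cols'
annotation.col1 <- select(org.Hs.eg.db, keys=c('Q7TNF6','Q53XJ8','P05787','P0CG48','Q96CG1','D3DR86','Q96FS5'), columns=c('UNIPROT', 'SYMBOL', 'ENTREZID'), keytype="UNIPROT")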

library('biomaRt')
ensembl <- useMart('ensembl', dataset="hsapiens_gene_ensembl")
annotation <- getBM(attributes=c("uniprot_swissprot_accession", "hgnc_symbol", "uniprot_genename"), filters="uniprot_swissprot_accession", values=c('Q7TNF6','Q53XJ8','P05787','P0CG48','Q96CG1','D3DR86','Q96FS5'), mart=ensembl)

I also noticed that `org.Hs.eg.db` and `biomaRt` differ in which UniProt accessions they fail to map.
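
To make that difference concrete, something like the following base R sketch could list which accessions each source leaves unmapped. The helper `unmatched()` and the two toy result tables are hypothetical stand-ins (the symbols are placeholders, not real annotation output); in practice the tables would be `annotation.col1` and `annotation` from the calls above.

```r
# Accessions queried in this post.
query <- c("Q7TNF6", "Q53XJ8", "P05787", "P0CG48", "Q96CG1", "D3DR86", "Q96FS5")

# Hypothetical helper: return the queried accessions that have no usable
# symbol in a result table (absent from the table, NA, or empty string).
unmatched <- function(query, result, acc_col, symbol_col) {
  symbol <- result[[symbol_col]]
  has_symbol <- !is.na(symbol) & nzchar(symbol)
  setdiff(query, result[[acc_col]][has_symbol])
}

# Toy stand-ins for the two result tables; the values are placeholders,
# NOT real output from org.Hs.eg.db or biomaRt.
org_res <- data.frame(
  UNIPROT = query,
  SYMBOL = c("GENE_A", NA, "GENE_C", "GENE_D", NA, NA, "GENE_G"),
  stringsAsFactors = FALSE
)
mart_res <- data.frame(
  uniprot_swissprot_accession = c("P05787", "P0CG48", "Q96CG1"),
  hgnc_symbol = c("GENE_C", "", "GENE_E"),
  stringsAsFactors = FALSE
)

miss_org  <- unmatched(query, org_res, "UNIPROT", "SYMBOL")
miss_mart <- unmatched(query, mart_res, "uniprot_swissprot_accession", "hgnc_symbol")

# Accessions missed by both sources, and by only one of them.
both_missing <- intersect(miss_org, miss_mart)
only_org     <- setdiff(miss_org, miss_mart)
only_mart    <- setdiff(miss_mart, miss_org)
```

Accessions in `both_missing` are the ones worth checking with a third source such as the MySQL query; those in `only_org` or `only_mart` can be filled in from the other table.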

Thanks a lot

r bioconductor mysql • 9.5k views