Question: Using Bioconductor Biomart to find gene name for sseqid from blastn results
Asked 3.7 years ago by jannetta.steyn (United Kingdom):

Hi All

I need to find a gene name using the sseqid from blastn results, for example:

gi|19698730|gb|AC079789.7|

I want to use either the gi or the gb ID for the lookup. I have parsed my data into an R data frame where sseqid1 contains the gi and sseqid2 contains the gb.
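For reference, the parsing step could look roughly like the following. This is a minimal sketch in base R, assuming the BLAST hits sit in a data frame `res` with a character column named `sseqid`; that column name and the one-row example table are assumptions made for illustration, and the real table may be laid out differently.

```r
# Hypothetical example table holding the subject id quoted above.
res <- data.frame(
  sseqid = "gi|19698730|gb|AC079789.7|",
  stringsAsFactors = FALSE
)

# Split on the literal "|" separator (fixed = TRUE avoids regex alternation).
# Each element becomes: "gi", <gi value>, "gb", <gb value>
parts <- strsplit(res$sseqid, "|", fixed = TRUE)

# Pick the second and fourth fields of every split id.
res$sseqid1 <- vapply(parts, function(p) p[2], character(1))  # gi part
res$sseqid2 <- vapply(parts, function(p) p[4], character(1))  # gb part

print(res)
```

Both new columns are kept as character vectors, which matches the `as.character(res$sseqid1)` conversion used in the query.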

This is my R code:

ensembl <- useMart('ENSEMBL_MART_ENSEMBL', 
                   dataset="cporcellus_gene_ensembl")
keys=as.character(res$sseqid1)
res$genename = getBM(
                   attributes=c('external_gene_name','protein_id',
                   'refseq_mrna_predicted','entrezgene'),
                   values=keys,mart=ensembl)

The last statement results in an error because of the biomaRt result set. My guess is that it returns more than one result per gi, but I can't work out what biomaRt is doing, why it is doing it, or what I need to do to fix it.
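The suspected problem can be illustrated without querying BioMart at all. `getBM` returns a data frame, and a data frame with a different number of rows than `res` cannot be assigned to a single column of `res`. The sketch below uses made-up stand-in data: the `bm` data frame, its `key` column, the ids and the placeholder gene names are all hypothetical and are not real BioMart output. It shows the assignment failing and one possible `merge`-based alternative, under the assumption that the query result also carries the id that was searched for.

```r
# Hypothetical BLAST table with two hits.
res <- data.frame(sseqid1 = c("id1", "id2"), stringsAsFactors = FALSE)

# Hypothetical stand-in for a getBM() result:
# "id1" maps to three gene names, "id2" has no match at all.
bm <- data.frame(
  key = c("id1", "id1", "id1"),
  external_gene_name = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Direct assignment fails because bm has 3 rows while res has 2.
assign_error <- tryCatch(
  { res$genename <- bm; NULL },
  error = function(e) conditionMessage(e)
)
print(assign_error)

# Alternative: collapse multiple names per key, then left-join onto res.
collapsed <- aggregate(external_gene_name ~ key, data = bm,
                       FUN = paste, collapse = ";")
annotated <- merge(res, collapsed,
                   by.x = "sseqid1", by.y = "key", all.x = TRUE)
print(annotated)
```

With `all.x = TRUE` every row of `res` is kept, so ids without a match end up with `NA` instead of being dropped.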

What I want is a gene name for each sseqid. I don't care whether the gi or the gb is used. I can't find exact definitions for the abbreviations gi and gb, and I haven't been able to find out which of the biomaRt attributes they correspond to. If this information is in the documentation, I have not been able to find it.

Can anyone help me get this done?

Thanks in advance. Jannetta
