Question: DESeq2 gene name annotation
5.6 years ago by
United Kingdom
BM70 wrote:

I am trying to annotate the Ensembl IDs in the DESeq2 results file by adding columns for gene symbols and gene names. I have tried both biomaRt and AnnotationDbi.

This is the output:


library("DESeq2")
# Loading required package: Rcpp
# Loading required package: RcppArmadillo

counts <- read.delim("3mTA2.txt", header=TRUE, row.names=1)
sample <- read.delim("~/sample.txt")
dds <- DESeqDataSetFromMatrix(countData=counts, colData=sample, design= ~ genotype)
dds <- DESeq(dds)  # fit the model; results() needs a fitted object
res <- results(dds)


columns (
# [1] "ENTREZID"     "PFAM"         "IPI"          "PROSITE"      "ACCNUM"   
# [6] "ALIAS"        "CHR"          "CHRLOC"       "CHRLOCEND"    "ENZYME"   
# [11] "PATH"         "PMID"         "REFSEQ"       "SYMBOL"       "UNIGENE"  
# [21] "GO"           "EVIDENCE"     "ONTOLOGY"     "GOALL"        "EVIDENCEALL"
# [26] "ONTOLOGYALL"  "MGI"      

res$hgnc_symbol <- convertIDs(row.names(res), "ENSEMBL", "SYMBOL",

# Error: could not find function "convertIDs"

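convertIDs <- function( ids, from, to, db, ifMultiple=c("putNA", "useFirst")) {
  stopifnot( inherits( db, "AnnotationDb" ) )
  ifMultiple <- match.arg( ifMultiple )
  # look up every ID; select() returns one row per (from, to) pair
  suppressWarnings( selRes <- AnnotationDbi::select(
    db, keys=ids, keytype=from, columns=c(from,to) ) )
  if ( ifMultiple == "putNA" ) {
    # drop IDs that map to more than one target, so they come back as NA
    duplicatedIds <- selRes[ duplicated( selRes[,1] ), 1 ]
    selRes <- selRes[ ! selRes[,1] %in% duplicatedIds, ]
  }
  # return the mapped values in the same order as the input IDs
  return( selRes[ match( ids, selRes[,1] ), 2 ] )
}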

res$hgnc_symbol <- convertIDs(row.names(res), "ENSEMBL", "SYMBOL",

# Error in .testForValidKeys(x, keys, keytype) :
#   None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments.
# Called from: .testForValidKeys(x, keys, keytype)
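One possible cause of the "None of the keys entered are valid" error is that the row names are not bare Ensembl gene IDs, for example because they carry a version suffix or several IDs joined with "+". The sketch below is only an illustration under that assumption: clean_ensembl_ids is a hypothetical helper and the example IDs are made up. It shows how the row names could be normalised before the lookup.

```r
# Hypothetical helper (not part of any package): keep the first ID of a
# "+"-joined row name and drop a trailing version suffix such as ".4".
clean_ensembl_ids <- function(ids) {
  first <- sapply(strsplit(ids, split = "+", fixed = TRUE), "[", 1)
  sub("\\.[0-9]+$", "", first)
}

# Made-up row names, for illustration only
example_ids <- c("ENSMUSG00000000001.4",
                 "ENSMUSG00000000028+ENSMUSG00000000031",
                 "ENSMUSG00000000037")
cleaned <- clean_ensembl_ids(example_ids)
print(cleaned)

# With an annotation package loaded, the share of recognised keys could then
# be checked, e.g. (assuming org.Mm.eg.db for mouse data):
# mean(cleaned %in% keys(org.Mm.eg.db, keytype = "ENSEMBL"))
```

If the share of recognised keys is still zero after cleaning, the IDs probably belong to a different organism or ID system than the annotation database being queried.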

library( "biomaRt" )

ensembl = useMart( "ensembl", dataset = "mmusculus_gene_ensembl" )
res$ensembl <- sapply( strsplit( rownames(res), split="\\+" ), "[", 1 )
genemap <- getBM( attributes = c("ensembl_gene_id", "entrezgene", "hgnc_symbol"),
                  filters = "ensembl_gene_id",
                  values = res$ensembl,
                  mart = ensembl )
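Once getBM() has returned a mapping table, the annotation still has to be lined up with the rows of the results. A minimal sketch of that step with match() follows. The tables res_toy and genemap_toy are made-up stand-ins for res and genemap, and their values are not real annotation.

```r
# Toy stand-ins for `res` and the `genemap` returned by getBM();
# all values are made up for illustration.
res_toy <- data.frame(
  ensembl = c("ENSMUSG00000000001", "ENSMUSG00000000028", "ENSMUSG00000000037"),
  log2FoldChange = c(1.5, -0.7, 0.2),
  stringsAsFactors = FALSE
)
genemap_toy <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000000028", "ENSMUSG00000000001"),
  entrezgene = c(102L, 101L),
  hgnc_symbol = c("GeneB", "GeneA"),
  stringsAsFactors = FALSE
)

# match() gives, for each row of res_toy, the row of genemap_toy with the
# same ID (NA when the ID is absent), so the annotation follows the results order.
idx <- match(res_toy$ensembl, genemap_toy$ensembl_gene_id)
res_toy$entrez <- genemap_toy$entrezgene[idx]
res_toy$hgnc_symbol <- genemap_toy$hgnc_symbol[idx]
print(res_toy)
```

IDs that are missing from the mapping table end up as NA instead of shifting the other rows.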
5.6 years ago by
Michael Love2.2k
United States
Michael Love2.2k wrote:

In the latest release, the GenomicFeatures package authors added mapIds() which is straightforward to use.

See ?mapIds after loading the GenomicFeatures package.
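A minimal sketch of how mapIds() might be used here. mapIds() is defined in AnnotationDbi, so it is called with the AnnotationDbi:: prefix. The helper add_gene_annotation is hypothetical, org.Mm.eg.db is assumed because the question uses mouse data, and the toy fold changes are made up.

```r
# Sketch: add symbol and gene-name columns to a results table whose row
# names are Ensembl gene IDs. `db` is an OrgDb object such as org.Mm.eg.db
# (an assumption based on the mouse data in the question).
add_gene_annotation <- function(res, db) {
  ids <- rownames(res)
  # multiVals = "first" keeps the first hit when an ID maps to several values
  res$symbol <- AnnotationDbi::mapIds(db, keys = ids, column = "SYMBOL",
                                      keytype = "ENSEMBL", multiVals = "first")
  res$gene_name <- AnnotationDbi::mapIds(db, keys = ids, column = "GENENAME",
                                         keytype = "ENSEMBL", multiVals = "first")
  res
}

# Usage, run only when the annotation package is installed
if (requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  toy <- data.frame(log2FoldChange = c(1.2, -0.8),
                    row.names = c("ENSMUSG00000000001", "ENSMUSG00000000028"))
  print(add_gene_annotation(toy, org.Mm.eg.db::org.Mm.eg.db))
}
```

Note that mapIds() raises the same "None of the keys entered are valid" error when no row name is a recognised key, so the ID format still has to match the chosen keytype.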


I updated the workflow to use mapIds()

Comment written 5.6 years ago by Michael Love
4.8 years ago by
University of Macau
jimmy_zeng90 wrote:

To go from Ensembl ID to gene symbol and gene name, I don't think you need a custom function.

In fact, there's enough information in the R package ""; you will find many pre-defined datasets in this package by using ls('').

You can use toTable to get two tables, toTable(org.Mm.egGENENAME) and toTable(org.Mm.egSYMBOL), and then use the merge function to combine this information as you need.

Hope this helps.
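The toTable() and merge() approach can be sketched as below. The two tables named in the answer are keyed by Entrez gene ID, so a third table linking Entrez to Ensembl IDs is assumed to be needed (org.Mm.egENSEMBL in the guarded part). The helper merge_annotation_tables is hypothetical, the toy tables are made up, and the column names gene_id, ensembl_id, symbol and gene_name are assumed to match what toTable() returns for these maps.

```r
# Hypothetical helper: join Entrez-keyed tables so that each Ensembl ID
# gets a symbol and a gene name. Column names are assumed to follow the
# toTable() output for the org.Mm.eg.db maps.
merge_annotation_tables <- function(ensembl_tab, symbol_tab, name_tab) {
  out <- merge(ensembl_tab, symbol_tab, by = "gene_id", all.x = TRUE)
  merge(out, name_tab, by = "gene_id", all.x = TRUE)
}

# Toy tables with made-up values, standing in for toTable() output
ensembl_tab <- data.frame(gene_id = c("1", "2"),
                          ensembl_id = c("ENSMUSG00000000001", "ENSMUSG00000000028"),
                          stringsAsFactors = FALSE)
symbol_tab <- data.frame(gene_id = c("1", "2"),
                         symbol = c("GeneA", "GeneB"),
                         stringsAsFactors = FALSE)
name_tab <- data.frame(gene_id = c("1", "2"),
                       gene_name = c("example gene A", "example gene B"),
                       stringsAsFactors = FALSE)
annot <- merge_annotation_tables(ensembl_tab, symbol_tab, name_tab)
print(annot)

# With the real package (assumed to be org.Mm.eg.db), run only if installed
if (requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  library(org.Mm.eg.db)
  annot_real <- merge_annotation_tables(toTable(org.Mm.egENSEMBL),
                                        toTable(org.Mm.egSYMBOL),
                                        toTable(org.Mm.egGENENAME))
  print(head(annot_real))
}
```

The merged table can then be matched against the row names of the results by its ensembl_id column.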
